Posted to java-user@lucene.apache.org by Jamie <ja...@stimulussoft.com> on 2010/03/19 12:49:57 UTC
Lucene 3.0 Search Performance Stats
Hi Guys
I just wanted to congratulate the Lucene guys for a fine job on 3.0!!
Since we switched our indexes to integer-based range queries on
the date (YYMMHHSS), search speed is lightning fast and memory
consumption has dropped considerably!
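For illustration, here is a minimal sketch of how a timestamp could be turned into a sortable number of this kind. The 12-digit year/month/day/hour/minute layout is an assumption (it matches the range bounds quoted later in this thread), and the class and method names are made up for the example; this is not the actual code behind these stats. Such a value does not fit in a 32-bit int, so a long is used. The resulting number is what would be indexed in a numeric field and queried with a numeric range query.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper: encodes a timestamp as a sortable long (assumed layout: yyyyMMddHHmm).
final class DateKey {
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("yyyyMMddHHmm");

    /** Formats the timestamp as 12 digits and parses them as a long. */
    static long encode(LocalDateTime t) {
        return Long.parseLong(t.format(FORMAT));
    }

    public static void main(String[] args) {
        // Sample bounds chosen to mirror the date range quoted later in the thread.
        long lower = encode(LocalDateTime.of(2009, 1, 1, 1, 1));
        long upper = encode(LocalDateTime.of(2010, 3, 22, 2, 25));
        System.out.println(lower + " to " + upper);
    }
}
```

Because the digits run from most to least significant unit, numeric order equals chronological order, which is what makes range queries and sorting on the value meaningful.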
Some stats:
Indexed Docs: 7.2M emails
Index Size: 24 GB (non-optimized)
Search Speed: 0.06 - 0.09 seconds (with sort on YYMMHHSS date)
Index stored on 4 Hitachi SAS HDDs in RAID 10
16 GB RAM
2x Xeon 4-core 2.4 GHz
OS FreeBSD 7.2
Filesystem UFS2 gjournal
I believe we are using all search performance recommendations now.
Good job!
Jamie
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
Re: Lucene 3.0 Search Performance Stats
Posted by Michael McCandless <lu...@mikemccandless.com>.
Looks like the bulk of your RAM usage is from the 370K index terms in
your terms dict...
The flex branch (once it lands) should substantially reduce that...
Mike
On Mon, Mar 22, 2010 at 8:35 AM, Jamie <ja...@stimulussoft.com> wrote:
> Hi Everyone
>
> The stats I sent through earlier were erroneous because the date range
> query selected fewer records than stated.
>
> The correct stats are:
>
> Lucene 3.0 Stats:
>
> Search conducted using Lucene's Realtime search feature (writer.getReader()
> for each search)
> Analyzer: Russian Analyzer
> Total Docs: 26.04M (emails data - all attachments, body content indexed)
> Index Size: 37G
> Query: body: test AND date: [200901010101 to 201003220225] with descending
> sort on date.
> Lucene Mem Usage: 32 MB (when sorted on date)
> Search Speed: 0.48 seconds (unsorted)
> Search Speed: 0.49 seconds (sorted on YYYYMMDDHHSS date)
>
> Hardware / Software:
>
> Index stored on 4 Hitachi SAS HDDs in RAID 10
> 16 GB RAM
> 2x Xeon 4-core 2.4 GHz
> OS FreeBSD 7.2
> Filesystem UFS2 gjournal
>
> From Yourkit, memory usage is as follows:
>
> Name Number of Objects Shallow Size
> org.apache.lucene.index.TermInfo 370610 14824400
> org.apache.lucene.index.Term 370505 11856160
Re: Lucene 3.0 Search Performance Stats
Posted by Jamie <ja...@stimulussoft.com>.
Hi Everyone
The stats I sent through earlier were erroneous because the date
range query selected fewer records than stated.
The correct stats are:
Lucene 3.0 Stats:
Search conducted using Lucene's Realtime search feature
(writer.getReader() for each search)
Analyzer: Russian Analyzer
Total Docs: 26.04M (emails data - all attachments, body content indexed)
Index Size: 37G
Query: body: test AND date: [200901010101 to 201003220225] with
descending sort on date.
Lucene Mem Usage: 32 MB (when sorted on date)
Search Speed: 0.48 seconds (unsorted)
Search Speed: 0.49 seconds (sorted on YYYYMMDDHHSS date)
Hardware / Software:
Index stored on 4 Hitachi SAS HDDs in RAID 10
16 GB RAM
2x Xeon 4-core 2.4 GHz
OS FreeBSD 7.2
Filesystem UFS2 gjournal
From YourKit, memory usage is as follows (shallow size in bytes):
Name Number of Objects Shallow Size
org.apache.lucene.index.TermInfo 370610 14824400
org.apache.lucene.index.Term 370505 11856160
com.stimulus.archiva.search.LuceneResult 20000 960000
org.apache.lucene.search.FieldDoc 10000 320000
org.apache.lucene.search.ScoreDoc 10000 240000
org.apache.lucene.index.FieldInfo 1027 41080
org.apache.lucene.index.SegmentReader$Norm 840 67200
org.apache.lucene.document.Field 578 36992
org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput 415 49800
org.apache.lucene.index.CompoundFileReader$CSIndexInput 380 36480
org.apache.lucene.util.UnicodeUtil$UTF8Result 320 10240
org.apache.lucene.index.TermBuffer 315 17640
org.apache.lucene.util.UnicodeUtil$UTF16Result 315 12600
org.apache.lucene.index.CompoundFileReader$FileEntry 280 8960
org.apache.lucene.util.CloseableThreadLocal 260 8320
org.apache.lucene.index.FreqProxTermsWriter$PostingList 256 12288
org.apache.lucene.index.SegmentInfo 185 19240
org.apache.lucene.index.SegmentReader$FieldsReaderLocal 140 5600
org.apache.lucene.index.ReadOnlySegmentReader 105 12600
org.apache.lucene.index.SegmentReader$Ref 105 2520
org.apache.lucene.index.SegmentTermEnum 105 11760
org.apache.lucene.search.FieldCacheImpl$Entry 105 3360
org.apache.lucene.index.TermInfosReader$ThreadResources 70 2240
org.apache.lucene.util.cache.SimpleLRUCache 70 1680
org.apache.lucene.util.cache.SimpleLRUCache$1 70 6160
org.apache.lucene.index.FieldsReader 50 4400
org.apache.lucene.index.IndexFileDeleter$RefCount 42 1344
org.apache.lucene.document.Document 41 1312
org.apache.lucene.util.SimpleStringInterner$Entry 41 1640
org.apache.lucene.index.FieldInfos 37 1480
org.apache.lucene.index.CompoundFileReader 35 2520
org.apache.lucene.index.SegmentReader 35 4200
org.apache.lucene.index.SegmentReader$CoreReaders 35 4480
org.apache.lucene.index.TermInfo[] 35 2963448
org.apache.lucene.index.TermInfosReader 35 3360
org.apache.lucene.index.Term[] 35 2963448
org.apache.lucene.search.FieldCache$CreationPlaceholder 35 840
org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexInput$Descriptor 35 2240
org.apache.lucene.index.RawPostingList[] 30 8184
org.apache.lucene.index.TermsHashPerField 24 3648
org.apache.lucene.analysis.WhitespaceAnalyzer 16 512
org.apache.lucene.document.Fieldable[] 13 416
org.apache.lucene.index.DocFieldProcessorPerField 12 672
org.apache.lucene.index.DocInverterPerField 12 768
org.apache.lucene.index.FreqProxTermsWriterPerField 12 864
org.apache.lucene.index.NormsWriterPerField 12 864
org.apache.lucene.index.TermVectorsTermsWriterPerField 12 960
org.apache.lucene.util.AttributeSource$State 12 384
org.apache.lucene.util.Version 8 256
org.apache.lucene.index.SegmentInfos 7 616
org.apache.lucene.document.FieldSelectorResult 6 192
org.apache.lucene.analysis.CharArraySet$UnmodifiableCharArraySet 5 160
org.apache.lucene.analysis.tokenattributes.OffsetAttributeImpl 5 120
org.apache.lucene.analysis.tokenattributes.TermAttributeImpl 5 160
org.apache.lucene.analysis.tokenattributes.PositionIncrementAttributeImpl 4 96
org.apache.lucene.index.BufferedDeletes 4 224
org.apache.lucene.index.TermsHash 4 320
org.apache.lucene.analysis.LowerCaseFilter 3 192
org.apache.lucene.analysis.PerFieldAnalyzerWrapper 3 144
org.apache.lucene.analysis.SimpleAnalyzer 3 96
org.apache.lucene.analysis.StopFilter 3 264
org.apache.lucene.analysis.ru.RussianAnalyzer 3 144
org.apache.lucene.analysis.ru.RussianAnalyzer$SavedStreams 3 120
org.apache.lucene.analysis.ru.RussianLetterTokenizer 3 288
org.apache.lucene.analysis.ru.RussianStemFilter 3 216
org.apache.lucene.analysis.ru.RussianStemmer 3 96
org.apache.lucene.index.IndexReader[] 3 912
org.apache.lucene.index.ReadOnlyDirectoryReader 3 384
org.apache.lucene.index.SegmentReader[] 3 912
org.apache.lucene.search.Sort 3 72
org.apache.lucene.search.SortField 3 168
org.apache.lucene.search.SortField[] 3 96
org.apache.lucene.search.TermQuery 3 96
org.apache.lucene.util.NamedThreadFactory 3 120
org.apache.lucene.analysis.CharReader 2 80
org.apache.lucene.index.ByteBlockPool 2 112
org.apache.lucene.index.ConcurrentMergeScheduler 2 112
org.apache.lucene.index.DocFieldProcessor 2 96
org.apache.lucene.index.DocFieldProcessorPerField[] 2 432
org.apache.lucene.index.DocInverter 2 80
org.apache.lucene.index.DocumentsWriter 2 608
org.apache.lucene.index.DocumentsWriter$ByteBlockAllocator 2 64
org.apache.lucene.index.DocumentsWriter$DocWriter[] 2 208
org.apache.lucene.index.DocumentsWriter$SkipDocWriter 2 64
org.apache.lucene.index.DocumentsWriter$WaitQueue 2 112
org.apache.lucene.index.DocumentsWriterThreadState[] 2 56
org.apache.lucene.index.FreqProxTermsWriter 2 80
org.apache.lucene.index.IndexFileDeleter 2 192
org.apache.lucene.index.IndexFileDeleter$CommitPoint 2 176
org.apache.lucene.index.IndexWriter 2 592
org.apache.lucene.index.IndexWriter$MaxFieldLength 2 64
org.apache.lucene.index.IndexWriter$ReaderPool 2 64
org.apache.lucene.index.IntBlockPool 2 112
org.apache.lucene.index.KeepOnlyLastCommitDeletionPolicy 2 32
org.apache.lucene.index.LogByteSizeMergePolicy 2 112
org.apache.lucene.index.NormsWriter 2 48
org.apache.lucene.index.StoredFieldsWriter 2 128
org.apache.lucene.index.StoredFieldsWriter$PerDoc[] 2 64
org.apache.lucene.index.TermVectorsTermsWriter 2 176
org.apache.lucene.index.TermVectorsTermsWriter$PerDoc[] 2 64
org.apache.lucene.index.TermsHashPerThread 2 176
org.apache.lucene.queryParser.QueryParser$Operator 2 64
org.apache.lucene.search.BooleanClause 2 64
org.apache.lucene.search.NumericRangeQuery 2 160
org.apache.lucene.search.ParallelMultiSearcher 2 144
org.apache.lucene.search.QueryWrapperFilter 2 48
org.apache.lucene.search.ScoreDoc[] 2 48
org.apache.lucene.search.Searchable[] 2 64
org.apache.lucene.store.NIOFSDirectory 2 96
org.apache.lucene.store.NativeFSLock 2 128
org.apache.lucene.store.NativeFSLockFactory 2 80
org.apache.lucene.analysis.CharArraySet 1 32
org.apache.lucene.analysis.LowerCaseTokenizer 1 96
org.apache.lucene.document.Field$Index$1 1 32
org.apache.lucene.document.Field$Index$2 1 32
org.apache.lucene.document.Field$Index$3 1 32
org.apache.lucene.document.Field$Index$4 1 32
org.apache.lucene.document.Field$Index$5 1 32
org.apache.lucene.document.Field$Index[] 1 64
org.apache.lucene.document.Field$Store$1 1 32
org.apache.lucene.document.Field$Store$2 1 32
org.apache.lucene.document.Field$Store[] 1 40
org.apache.lucene.document.Field$TermVector$1 1 32
org.apache.lucene.document.Field$TermVector$2 1 32
org.apache.lucene.document.Field$TermVector$3 1 32
org.apache.lucene.document.Field$TermVector$4 1 32
org.apache.lucene.document.Field$TermVector$5 1 32
org.apache.lucene.document.Field$TermVector[] 1 64
org.apache.lucene.document.FieldSelectorResult[] 1 72
org.apache.lucene.document.Field[] 1 24
org.apache.lucene.index.ByteSliceReader 1 80
org.apache.lucene.index.CharBlockPool 1 56
org.apache.lucene.index.DocFieldProcessorPerThread 1 112
org.apache.lucene.index.DocFieldProcessorPerThread$PerDoc[] 1 32
org.apache.lucene.index.DocInverterPerThread 1 72
org.apache.lucene.index.DocInverterPerThread$SingleTokenAttributeSource 1 64
org.apache.lucene.index.DocumentsWriter$1 1 16
org.apache.lucene.index.DocumentsWriter$DocState 1 72
org.apache.lucene.index.DocumentsWriterThreadState 1 48
org.apache.lucene.index.FieldInvertState 1 48
org.apache.lucene.index.FieldsWriter 1 48
org.apache.lucene.index.FreqProxTermsWriterPerThread 1 32
org.apache.lucene.index.IndexFileNameFilter 1 32
org.apache.lucene.index.NormsWriterPerThread 1 32
org.apache.lucene.index.ReusableStringReader 1 48
org.apache.lucene.index.SegmentWriteState 1 72
org.apache.lucene.index.StoredFieldsWriter$PerDoc 1 56
org.apache.lucene.index.StoredFieldsWriterPerThread 1 48
org.apache.lucene.index.TermVectorsTermsWriterPerThread 1 72
org.apache.lucene.queryParser.QueryParser$Operator[] 1 40
org.apache.lucene.search.BooleanClause$Occur$1 1 32
org.apache.lucene.search.BooleanClause$Occur$2 1 32
org.apache.lucene.search.BooleanClause$Occur$3 1 32
org.apache.lucene.search.BooleanClause$Occur[] 1 48
org.apache.lucene.search.BooleanQuery 1 40
org.apache.lucene.search.DefaultSimilarity 1 24
org.apache.lucene.search.DocIdSet$1 1 24
org.apache.lucene.search.DocIdSet$1$1 1 32
org.apache.lucene.search.FieldCache$1 1 16
org.apache.lucene.search.FieldCache$10 1 16
org.apache.lucene.search.FieldCache$2 1 16
org.apache.lucene.search.FieldCache$3 1 16
org.apache.lucene.search.FieldCache$4 1 16
org.apache.lucene.search.FieldCache$5 1 16
org.apache.lucene.search.FieldCache$6 1 16
org.apache.lucene.search.FieldCache$7 1 16
org.apache.lucene.search.FieldCache$8 1 16
org.apache.lucene.search.FieldCache$9 1 16
org.apache.lucene.search.FieldCacheImpl 1 32
org.apache.lucene.search.FieldCacheImpl$ByteCache 1 32
org.apache.lucene.search.FieldCacheImpl$DoubleCache 1 32
org.apache.lucene.search.FieldCacheImpl$FloatCache 1 32
org.apache.lucene.search.FieldCacheImpl$IntCache 1 32
org.apache.lucene.search.FieldCacheImpl$LongCache 1 32
org.apache.lucene.search.FieldCacheImpl$ShortCache 1 32
org.apache.lucene.search.FieldCacheImpl$StringCache 1 32
org.apache.lucene.search.FieldCacheImpl$StringIndexCache 1 32
org.apache.lucene.search.MultiTermQuery$1 1 32
org.apache.lucene.search.MultiTermQuery$ConstantScoreBooleanQueryRewrite 1 16
org.apache.lucene.search.MultiTermQuery$ConstantScoreFilterRewrite 1 16
org.apache.lucene.search.MultiTermQuery$ScoringBooleanQueryRewrite 1 16
org.apache.lucene.search.TopDocs 1 32
org.apache.lucene.store.RAMFile 1 56
org.apache.lucene.store.RAMOutputStream 1 72
org.apache.lucene.util.AttributeSource$AttributeFactory$DefaultAttributeFactory 1 16
org.apache.lucene.util.SimpleStringInterner 1 32
org.apache.lucene.util.SimpleStringInterner$Entry[] 1 8216
org.apache.lucene.util.UnicodeUtil$UTF8Result[] 1 40
org.apache.lucene.util.Version[] 1 88
Total: 34562928 bytes (32.96 MB)
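As a quick sanity check on the totals in the listing, the sketch below converts the byte total to megabytes and works out how much of it the Term and TermInfo entries account for. The figures are copied from the listing; grouping those four rows together as "the terms dictionary" is an assumption made for this example, as are the class and method names.

```java
// Hypothetical helper: arithmetic on the shallow sizes reported in the listing.
final class TermDictShare {
    // Shallow sizes in bytes, copied from the profiler listing in this thread.
    static final long TERM_INFO = 14_824_400L;
    static final long TERM = 11_856_160L;
    static final long TERM_INFO_ARRAYS = 2_963_448L;
    static final long TERM_ARRAYS = 2_963_448L;
    static final long TOTAL = 34_562_928L;

    /** Bytes held by the four Term/TermInfo rows (assumed to make up the terms dictionary). */
    static long termDictBytes() {
        return TERM_INFO + TERM + TERM_INFO_ARRAYS + TERM_ARRAYS;
    }

    /** Share of the reported total, in percent. */
    static double sharePercent() {
        return 100.0 * termDictBytes() / TOTAL;
    }

    /** Reported total converted to megabytes (1 MB = 1024 * 1024 bytes). */
    static double totalMegabytes() {
        return TOTAL / (1024.0 * 1024.0);
    }

    public static void main(String[] args) {
        System.out.printf("total: %.2f MB%n", totalMegabytes());
        System.out.printf("term rows: %d bytes (%.1f%% of total)%n",
                termDictBytes(), sharePercent());
    }
}
```

Under that grouping, the term-related rows make up well over nine tenths of the reported shallow size, which is consistent with the observation that most of the RAM goes to the index terms.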
RE: Lucene 3.0 Search Performance Stats
Posted by Uwe Schindler <uw...@thetaphi.de>.
Hi Jamie,
thanks for reporting back the numbers about your usage of NumericField and NumericRangeQuery! I am glad to hear about it.
> Sure. As soon as I get access to the server again, I'll get the mem
> stats for you. I will say that Lucene was consuming a large amount of
> memory before we moved over to using Numerics. The reason for this is
> that we were encoding dates as strings. Our date time strings were
> unique, so as the number of records exploded, so too did the number of
> terms in the index. As I understand it (and I am no Lucene expert),
> Lucene's sorting mechanisms need to load up all the terms in the index
> in memory during a sort. Thus, if you execute a sorted search, Lucene's
> memory consumption goes through the roof. Using Numerics avoids this
> problem.
That's true, as you only need to store an integer per document in the cache (and not a String). FieldCache warm-up and sorting are also faster.
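A back-of-envelope sketch of why one number per document is cheaper than one String per document. Everything here is an assumption for illustration: the document count is hypothetical, the 40-byte per-String overhead is a rough JVM-dependent guess, and 2 bytes per character reflects older JVMs. It is not a measurement of Lucene's FieldCache.

```java
// Hypothetical estimator: compares one primitive per document with one String per document.
final class SortCacheEstimate {
    /** Bytes needed for one primitive value per document (array header ignored). */
    static long numericBytes(long docs, int bytesPerValue) {
        return docs * bytesPerValue;
    }

    /**
     * Rough bytes for one distinct String per document: 2 bytes per char plus
     * an assumed fixed per-String overhead (object headers, fields, reference).
     */
    static long stringBytes(long docs, int charsPerValue, int assumedOverhead) {
        return docs * (2L * charsPerValue + assumedOverhead);
    }

    public static void main(String[] args) {
        long docs = 1_000_000L; // hypothetical index size, not a figure from this thread
        System.out.println("int per doc:    " + numericBytes(docs, 4) + " bytes");
        System.out.println("long per doc:   " + numericBytes(docs, 8) + " bytes");
        System.out.println("String per doc: " + stringBytes(docs, 12, 40) + " bytes");
    }
}
```

The exact ratio depends on the JVM and on how many values are distinct, but the gap grows linearly with the number of documents.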
> There are a lot of other strategies we used to reduce memory consumption,
> such as making sure that you are caching Searchers and IndexReaders.
Re: Lucene 3.0 Search Performance Stats
Posted by Jamie <ja...@stimulussoft.com>.
Hi Monique
Sure. As soon as I get access to the server again, I'll get the mem
stats for you. I will say that Lucene was consuming a large amount of
memory before we moved over to using Numerics. The reason for this is
that we were encoding dates as strings. Our date time strings were
unique, so as the number of records exploded, so too did the number of
terms in the index. As I understand it (and I am no Lucene expert),
Lucene's sorting mechanisms need to load up all the terms in the index
in memory during a sort. Thus, if you execute a sorted search, Lucene's
memory consumption goes through the roof. Using Numerics avoids this
problem.
There are a lot of other strategies we used to reduce memory consumption,
such as making sure that you are caching Searchers and IndexReaders.
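The caching idea can be sketched generically as follows. This is not Lucene code: the class name is made up, and the type parameter stands in for whatever expensive object (a searcher or reader) is being reused. A real implementation would also need to close the old instance when replacing it.

```java
import java.util.function.Supplier;

// Hypothetical holder: opens an expensive object once and reuses it across calls.
final class SharedInstance<T> {
    private final Supplier<T> opener;
    private T current;

    SharedInstance(Supplier<T> opener) {
        this.opener = opener;
    }

    /** Returns the cached instance, opening it only on first use or after invalidate(). */
    synchronized T get() {
        if (current == null) {
            current = opener.get();
        }
        return current;
    }

    /** Drops the cached instance so that the next get() opens a fresh one. */
    synchronized void invalidate() {
        current = null;
    }
}
```

Each search would call get() instead of opening a new instance, and invalidate() would be called only when the underlying index has changed enough to warrant reopening.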
Regards,
Jamie
On 2010/03/19 07:04 PM, Monique Monteiro wrote:
> Hi Jamie,
>
> Could you please tell us how much memory your application consumes
> with Lucene? I'm asking because we are having memory consumption problems
> with a 32 GB index and 1.5 GB of RAM allocated to our web application. At the
> moment, we use textual search.
> Thanks in advance,
> Monique
Re: Lucene 3.0 Search Performance Stats
Posted by Monique Monteiro <mo...@gmail.com>.
Hi Jamie,
Could you please tell us how much memory your application consumes
with Lucene? I'm asking because we are having memory consumption problems
with a 32 GB index and 1.5 GB of RAM allocated to our web application. At the
moment, we use textual search.
Thanks in advance,
Monique
--
Monique Monteiro, MSc
Auditora Federal de Controle Externo - Tribunal de Contas da União (TCU)
IBM OOAD / SCJP / MCTS Web
Blog: http://moniquelouise.spaces.live.com/
Twitter: http://twitter.com/monilouise
MSN: monique_louise@msn.com
GTalk: monique.louise@gmail.com
Re: Lucene 3.0 Search Performance Stats
Posted by Michael McCandless <lu...@mikemccandless.com>.
Very nice! Thanks for sharing :)
Mike
On Fri, Mar 19, 2010 at 6:53 AM, Jamie <ja...@stimulussoft.com> wrote:
> I forgot to point out, this is a search using the Lucene realtime search
> feature. We get the reader from IndexWriter.getReader() for each search.
Re: Lucene 3.0 Search Performance Stats
Posted by Jamie <ja...@stimulussoft.com>.
I forgot to point out, this is a search using the Lucene realtime search
feature. We get the reader from IndexWriter.getReader() for each search.