Posted to dev@lucene.apache.org by Bernhard Messer <Be...@intrafind.de> on 2004/08/10 00:03:19 UTC

TermVectorsReader performance

Hi all,

I just wrote a test case to measure TermVectorsReader performance
when one IndexReader is shared by several threads. The test adds
1000 documents, each with one field and a different term, to a
RAMDirectory. It then starts 1 to 10 threads that share the same
IndexReader instance, and each thread calls getTermVectors(docId) 100 times.
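
A minimal sketch of such a harness, for anyone who wants to reproduce
the setup. It does not use any Lucene API: VectorSource, fakeSource and
SharedReaderBench are hypothetical names, and the fake source only stands
in for the shared IndexReader, so the timings say nothing about Lucene itself.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for the shared IndexReader; not a Lucene type.
interface VectorSource {
    String[] termsFor(int docId);
}

class SharedReaderBench {
    // Fake source (assumption): synchronized lookup, one distinct term per document.
    static VectorSource fakeSource() {
        return new VectorSource() {
            @Override
            public synchronized String[] termsFor(int docId) {
                return new String[] { "term" + docId };
            }
        };
    }

    // Starts `threads` workers that all use the same `source`; each worker
    // performs `calls` lookups. Returns the number of completed lookups.
    static long run(VectorSource source, int threads, int calls, int numDocs) {
        AtomicLong completed = new AtomicLong();
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            final int offset = t;
            workers.add(new Thread(() -> {
                for (int i = 0; i < calls; i++) {
                    int docId = (offset + i) % numDocs;
                    if (source.termsFor(docId) != null) {
                        completed.incrementAndGet();
                    }
                }
            }));
        }
        long start = System.nanoTime();
        for (Thread w : workers) {
            w.start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        long elapsedMs = (System.nanoTime() - start) / 1_000_000L;
        System.out.println(threads + " thread(s): " + completed.get()
                + " lookups in " + elapsedMs + " ms");
        return completed.get();
    }

    public static void main(String[] args) {
        // 1 to 10 threads, 100 calls each, 1000 documents, as described above.
        VectorSource shared = fakeSource();
        for (int threads = 1; threads <= 10; threads++) {
            run(shared, threads, 100, 1000);
        }
    }
}
```

To measure the real thing, the fake source would be replaced by an
adapter around the IndexReader opened on the RAMDirectory.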

As the CPU time profiler shows, most of the time (81%, the top three
entries in the profile below) is spent waiting for methods to finish.
Looking at the TermVectorsReader implementation, nearly all of its
methods are synchronized.
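
To illustrate why synchronized methods on a shared instance make threads
queue up, here is a small sketch. SerializedReader is a hypothetical class,
not Lucene code, and the sleep only simulates work done while the monitor
is held. It records how many threads are ever inside the method at once.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical class whose only method is synchronized; not Lucene code.
class SerializedReader {
    private final AtomicInteger inside = new AtomicInteger();
    private final AtomicInteger maxInside = new AtomicInteger();

    synchronized int read(int docId) {
        int now = inside.incrementAndGet();
        // Remember the highest number of threads seen inside read() at once.
        maxInside.accumulateAndGet(now, Math::max);
        try {
            Thread.sleep(1); // simulated work while the monitor is held
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        inside.decrementAndGet();
        return docId;
    }

    int maxConcurrentCallers() {
        return maxInside.get();
    }

    // Runs `threads` workers against one instance and reports the highest
    // number of threads observed inside read() at the same time.
    static int measure(int threads, int callsPerThread) {
        SerializedReader reader = new SerializedReader();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < callsPerThread; i++) {
                    reader.read(i);
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return reader.maxConcurrentCallers();
    }

    public static void main(String[] args) {
        System.out.println("max concurrent callers: " + measure(4, 5));
    }
}
```

Because the monitor admits one thread at a time, the reported maximum
stays at 1 no matter how many threads are started; the others wait.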

Before I start racking my brain and digging into that nightmare of
synchronization: has anybody out there already started cleaning up
TermVectorsReader, or at least thought about it?

My first idea was to make the termVector object a ThreadLocal in
SegmentReader, but I think this wouldn't work because IndexReaders are
using their own thread (which is quite clever and should not be a
candidate for a change).
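
For reference, the general ThreadLocal pattern could be sketched as below.
This is only the pattern on hypothetical classes (Cursor and
PerThreadCursors are invented names, not Lucene types); it assumes each
thread can cheaply get its own stateful reader over shared read-only data,
and it does not settle whether that holds inside SegmentReader.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stateful cursor over shared, read-only data; not a Lucene class.
class Cursor {
    private final int[] data; // shared and never modified
    private int pos;          // per-instance state, not thread-safe

    Cursor(int[] data) {
        this.data = data;
    }

    void seek(int p) {
        pos = p;
    }

    int next() {
        return data[pos++];
    }
}

class PerThreadCursors {
    private final ThreadLocal<Cursor> local;

    PerThreadCursors(int[] data) {
        // Each thread lazily gets its own Cursor on first use.
        this.local = ThreadLocal.withInitial(() -> new Cursor(data));
    }

    // Not synchronized: seek and read happen on the calling thread's own Cursor.
    int valueAt(int index) {
        Cursor c = local.get();
        c.seek(index);
        return c.next();
    }

    // Reads from several threads and counts values that differ from the expected ones.
    static int countMismatches(int threads, int calls) {
        int[] data = new int[1000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i * 2;
        }
        PerThreadCursors reader = new PerThreadCursors(data);
        AtomicInteger mismatches = new AtomicInteger();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int offset = t * 31;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < calls; i++) {
                    int idx = (offset + i) % data.length;
                    if (reader.valueAt(idx) != idx * 2) {
                        mismatches.incrementAndGet();
                    }
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return mismatches.get();
    }

    public static void main(String[] args) {
        System.out.println("mismatches: " + countMismatches(10, 100));
    }
}
```

The trade-off is one Cursor per thread instead of one lock per call.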

CPU TIME (ms) BEGIN (total = 100) Mon Aug  9 23:43:03 2004
rank   self   accum   count trace method
   1 32.00%  32.00%    1553    58 java.lang.Thread.sleep
   2 25.00%  57.00%      75    60 java.lang.Object.wait
   3 24.00%  81.00%      77    59 java.lang.Object.wait
   4  4.00%  85.00% 1229000    56 org.apache.lucene.store.InputStream.readVInt
   5  4.00%  89.00%  100000    51 org.apache.lucene.index.TermVectorsReader.readTermVector
   6  3.00%  92.00% 1745300    54 org.apache.lucene.store.InputStream.readByte
   7  2.00%  94.00%  343000    49 org.apache.lucene.store.InputStream.readChars
   8  2.00%  96.00% 1229000    53 org.apache.lucene.store.InputStream.readByte
   9  1.00%  97.00%  200000    55 org.apache.lucene.store.InputStream.readInt
  10  1.00%  98.00%  800000    52 org.apache.lucene.store.InputStream.readByte
  11  1.00%  99.00%  343000    57 java.lang.String.<init>
  12  1.00% 100.00%  100000    50 org.apache.lucene.index.TermVectorsReader.get
CPU TIME (ms) END

thx
Bernhard

