You are viewing a plain text version of this content. The canonical link for it is here.
Posted to java-user@lucene.apache.org by Chew Yee Chuang <ye...@tecforte.com> on 2007/08/01 04:29:30 UTC

RE: High CPU usage duing index and search

Hi,

Thanks for the link provided, actually I've go through those article when I
developing the index and search function for my application. I haven’t try
profiler yet, but I monitor the CPU usage and notice that whatever index or
search performing, the CPU usage raise to 100%. Below I will try to
elaborate more on what my application is doing and how I index and search.

There are many concurrent process running, first, the application will write
records that received into a text file with tab separated each different
field. Application will point to a new file every 10mins and start writing
to it. So every file will contains only 10mins record, approximate 600,000
records per file. Then, the indexing process will check whether there is a
text file to be index, if it is, the thread will wake up and start perform
indexing.
 
The indexing process will first add documents to RAMDir, Then later, add
RAMDir into FSDir by calling addIndexNoOptimize() when there is 100,000
documents(32 fields per doc) in RAMDir. There is only 1 IndexWriter(FSDir)
was created but a few IndexWriter(RAMDir) was created during the whole
process. Below are some configuration for IndexWriters that I mentioned:-

IndexWriter (RAMDir)
- SimpleAnalyzer
- setMaxBufferedDocs(10000)
- Filed.Store.YES
- Field.Index.NO_NORMS

IndexWriter (FSDir)
- SimpleAnalyzer
- setMergeFactor(20)
- addIndexesNoOptimize()

For the searching, because there are many queries(20,000) run continuously
to generate the aggregate table for reporting purpose. All this queries is
run in nested loop, and there is only 1 Searcher created, I try searcher and
filter as well, filter give me better result, but both also utilize lots of
CPU resources.

Hope this info will help and sorry for my bad English.

Thanks
eChuang, Chew

-----Original Message-----
From: karl wettin [mailto:karl.wettin@gmail.com] 
Sent: Tuesday, July 31, 2007 5:54 PM
To: java-user@lucene.apache.org
Subject: Re: High CPU usage duing index and search


31 jul 2007 kl. 05.25 skrev Chew Yee Chuang:
> But just notice that when Lucene performing search or index,
> the CPU usage on my machine raise to 100%, because of this issue,  
> some of my
> others backend process will slow down eventually. Just want to know  
> does
> anyone face this problem before ? and is it any idea on how to  
> overcome this
> problem ?

Did you run a profiler to see what it is that consume all the resources?
It is very hard to guess based on the information you supplied. Start  
here:

http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/ImproveIndexingSpeed
http://wiki.apache.org/lucene-java/ImproveSearchingSpeed


-- 
karl

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


No virus found in this incoming message.
Checked by AVG Free Edition. 
Version: 7.5.476 / Virus Database: 269.11.0/927 - Release Date: 7/30/2007
5:02 PM
 

No virus found in this outgoing message.
Checked by AVG Free Edition. 
Version: 7.5.476 / Virus Database: 269.11.0/929 - Release Date: 7/31/2007
5:26 PM
 



---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


RE: High CPU usage duing index and search

Posted by testn <te...@doramail.com>.
20,000 queries continuously? Sounds a bit too much. Can you elaborate more
what you need to do? Probably you won't need that many queries.



Chew Yee Chuang wrote:
> 
> Hi,
> 
> Thanks for the link provided, actually I've go through those article when
> I
> developing the index and search function for my application. I haven’t try
> profiler yet, but I monitor the CPU usage and notice that whatever index
> or
> search performing, the CPU usage raise to 100%. Below I will try to
> elaborate more on what my application is doing and how I index and search.
> 
> There are many concurrent process running, first, the application will
> write
> records that received into a text file with tab separated each different
> field. Application will point to a new file every 10mins and start writing
> to it. So every file will contains only 10mins record, approximate 600,000
> records per file. Then, the indexing process will check whether there is a
> text file to be index, if it is, the thread will wake up and start perform
> indexing.
>  
> The indexing process will first add documents to RAMDir, Then later, add
> RAMDir into FSDir by calling addIndexNoOptimize() when there is 100,000
> documents(32 fields per doc) in RAMDir. There is only 1 IndexWriter(FSDir)
> was created but a few IndexWriter(RAMDir) was created during the whole
> process. Below are some configuration for IndexWriters that I mentioned:-
> 
> IndexWriter (RAMDir)
> - SimpleAnalyzer
> - setMaxBufferedDocs(10000)
> - Filed.Store.YES
> - Field.Index.NO_NORMS
> 
> IndexWriter (FSDir)
> - SimpleAnalyzer
> - setMergeFactor(20)
> - addIndexesNoOptimize()
> 
> For the searching, because there are many queries(20,000) run continuously
> to generate the aggregate table for reporting purpose. All this queries is
> run in nested loop, and there is only 1 Searcher created, I try searcher
> and
> filter as well, filter give me better result, but both also utilize lots
> of
> CPU resources.
> 
> Hope this info will help and sorry for my bad English.
> 
> Thanks
> eChuang, Chew
> 
> -----Original Message-----
> From: karl wettin [mailto:karl.wettin@gmail.com] 
> Sent: Tuesday, July 31, 2007 5:54 PM
> To: java-user@lucene.apache.org
> Subject: Re: High CPU usage duing index and search
> 
> 
> 31 jul 2007 kl. 05.25 skrev Chew Yee Chuang:
>> But just notice that when Lucene performing search or index,
>> the CPU usage on my machine raise to 100%, because of this issue,  
>> some of my
>> others backend process will slow down eventually. Just want to know  
>> does
>> anyone face this problem before ? and is it any idea on how to  
>> overcome this
>> problem ?
> 
> Did you run a profiler to see what it is that consume all the resources?
> It is very hard to guess based on the information you supplied. Start  
> here:
> 
> http://wiki.apache.org/lucene-java/BasicsOfPerformance
> http://wiki.apache.org/lucene-java/ImproveIndexingSpeed
> http://wiki.apache.org/lucene-java/ImproveSearchingSpeed
> 
> 
> -- 
> karl
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
> 
> 
> No virus found in this incoming message.
> Checked by AVG Free Edition. 
> Version: 7.5.476 / Virus Database: 269.11.0/927 - Release Date: 7/30/2007
> 5:02 PM
>  
> 
> No virus found in this outgoing message.
> Checked by AVG Free Edition. 
> Version: 7.5.476 / Virus Database: 269.11.0/929 - Release Date: 7/31/2007
> 5:26 PM
>  
> 
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
> 
> 
> 

-- 
View this message in context: http://www.nabble.com/High-CPU-usage-duing-index-and-search-tf4190756.html#a11962524
Sent from the Lucene - Java Users mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


AW: High CPU usage duing index and search

Posted by "Steinert, Fabian" <Fa...@fokus.fraunhofer.de>.
Hi Chew,
 
with Lucene you could try the following:
 
Make one query for each single value in each category (each Term):
1Q - Gender:M
2Q - Department:Accounting
3Q - Department:R&D
4Q - ...
 
with a custom HitCollector like the following example taken from org.apache.lucene.search.HitCollector API Spec:
 
   Searcher searcher = new IndexSearcher(indexReader);
   final BitSet bits = new BitSet(indexReader.maxDoc());
   searcher.search(query, new HitCollector() {
       public void collect(int doc, float score) {
         bits.set(doc);
       }
     });
 
Thus you'l get one BitSet for each Term with bits set at DocIDs containing the Term.
 
Then do the combinatory part on these BitSets like:
 
  BitSet FemaleAndRnD = ((BitSet) RnD.clone()).andNot(Male);

Cheers,
Fabian
 

________________________________

Von: Chew Yee Chuang [mailto:yeechuang@tecforte.com]
Gesendet: Mi 15.08.2007 10:31
An: java-user@lucene.apache.org
Betreff: RE: High CPU usage duing index and search



Greetings,

I have tested with Mysql, the grouping is ok when there is not much records in the table, but when I come across to performed grouping in a table which have 3 millions of records, It really take a very long time to finish. Thus, Im looking at lucene and hope it can help.

Thank you
eChuang, Chew




RE: High CPU usage duing index and search

Posted by Chew Yee Chuang <ye...@tecforte.com>.
Greetings,

I have tested with Mysql, the grouping is ok when there is not much records in the table, but when I come across to performed grouping in a table which have 3 millions of records, It really take a very long time to finish. Thus, Im looking at lucene and hope it can help.

Thank you
eChuang, Chew

-----Original Message-----
From: testn [mailto:test1@doramail.com] 
Sent: Monday, August 13, 2007 4:34 PM
To: java-user@lucene.apache.org
Subject: RE: High CPU usage duing index and search


To me, it looks like what you are trying to achieve is more suitable to be
the database where it can help you do grouping and sorting, etc. But if you
still want to achieve it using lucene, you might want to post some code so
that I can go through it and see why it uses so much resource then.


Chew Yee Chuang wrote:
> 
> Hi testn,
> 
> I have tested Filter, it is pretty fast, but still take a lot of CPU
> resource, Maybe it could due to the number of filter I run.
> 
> Thank you
> eChuang, Chew
> 
> 
> -----Original Message-----
> From: testn [mailto:test1@doramail.com] 
> Sent: Tuesday, August 07, 2007 10:37 PM
> To: java-user@lucene.apache.org
> Subject: RE: High CPU usage duing index and search
> 
> 
> Check out Filter class. You can create a separate filter for each field
> and
> then chain them together using ChainFilter. If you cache the filter, it
> will
> be pretty fast. 
> 
> 
> Chew Yee Chuang wrote:
>> 
>> Greetings,
>> 
>> Yes, process a little bit and stop for a while really reduce the CPU
>> usage,
>> but I need to find out a balance so that the indexing or searching will
>> not
>> have so much delay.
>> 
>> Execute 20,000 queries at a time is because the process is generating the
>> aggregation data for reporting,
>> E.g Gender (M,F), Department (Accounting, R&D, Financial,...etc), 
>> 1Q - Gender:M AND Department: Accounting
>> 2Q - Gender:M AND Department: R&D
>> 3Q - Gender:M AND Department: Financial
>> 4Q - Gender:F AND Department: Accounting
>> 5Q - ....
>> Thus, the more combination, the more query need to run. For now, I still
>> can't get any idea on how to reduce it, just thinking maybe there is a
>> different way to index it so that I can get It easily.
>> 
>> Any help would be appreciated.
>> 
>> Thanks
>> eChuang, Chew
>> 
>> -----Original Message-----
>> From: karl wettin [mailto:karl.wettin@gmail.com] 
>> Sent: Thursday, August 02, 2007 7:11 AM
>> To: java-user@lucene.apache.org
>> Subject: Re: High CPU usage duing index and search
>> 
>> It sounds like you have a fairly busy system, perhaps 100% load on the
>> process is not that strange, at least not during short periods of time.
>> 
>> A simpler solution would be to nice the process a little bit in order to
>> give your background jobs some more time to think.
>> 
>> Running a profiler is still the best advice I can think of. It should
>> clearly show you what is going on when you run out of CPU.
>> 
>> --  
>> karl
>> 
>> 1 aug 2007 kl. 04.29 skrev Chew Yee Chuang:
>> 
>>> Hi,
>>>
>>> Thanks for the link provided, actually I've go through those  
>>> article when I
>>> developing the index and search function for my application. I  
>>> haven’t try
>>> profiler yet, but I monitor the CPU usage and notice that whatever  
>>> index or
>>> search performing, the CPU usage raise to 100%. Below I will try to
>>> elaborate more on what my application is doing and how I index and  
>>> search.
>>>
>>> There are many concurrent process running, first, the application  
>>> will write
>>> records that received into a text file with tab separated each  
>>> different
>>> field. Application will point to a new file every 10mins and start  
>>> writing
>>> to it. So every file will contains only 10mins record, approximate  
>>> 600,000
>>> records per file. Then, the indexing process will check whether  
>>> there is a
>>> text file to be index, if it is, the thread will wake up and start  
>>> perform
>>> indexing.
>>>
>>> The indexing process will first add documents to RAMDir, Then  
>>> later, add
>>> RAMDir into FSDir by calling addIndexNoOptimize() when there is  
>>> 100,000
>>> documents(32 fields per doc) in RAMDir. There is only 1 IndexWriter 
>>> (FSDir)
>>> was created but a few IndexWriter(RAMDir) was created during the whole
>>> process. Below are some configuration for IndexWriters that I  
>>> mentioned:-
>>>
>>> IndexWriter (RAMDir)
>>> - SimpleAnalyzer
>>> - setMaxBufferedDocs(10000)
>>> - Filed.Store.YES
>>> - Field.Index.NO_NORMS
>>>
>>> IndexWriter (FSDir)
>>> - SimpleAnalyzer
>>> - setMergeFactor(20)
>>> - addIndexesNoOptimize()
>>>
>>> For the searching, because there are many queries(20,000) run  
>>> continuously
>>> to generate the aggregate table for reporting purpose. All this  
>>> queries is
>>> run in nested loop, and there is only 1 Searcher created, I try  
>>> searcher and
>>> filter as well, filter give me better result, but both also utilize  
>>> lots of
>>> CPU resources.
>>>
>>> Hope this info will help and sorry for my bad English.
>>>
>>> Thanks
>>> eChuang, Chew
>>>
>>> -----Original Message-----
>>> From: karl wettin [mailto:karl.wettin@gmail.com]
>>> Sent: Tuesday, July 31, 2007 5:54 PM
>>> To: java-user@lucene.apache.org
>>> Subject: Re: High CPU usage duing index and search
>>>
>>>
>>> 31 jul 2007 kl. 05.25 skrev Chew Yee Chuang:
>>>> But just notice that when Lucene performing search or index,
>>>> the CPU usage on my machine raise to 100%, because of this issue,
>>>> some of my
>>>> others backend process will slow down eventually. Just want to know
>>>> does
>>>> anyone face this problem before ? and is it any idea on how to
>>>> overcome this
>>>> problem ?
>>>
>>> Did you run a profiler to see what it is that consume all the  
>>> resources?
>>> It is very hard to guess based on the information you supplied. Start
>>> here:
>>>
>>> http://wiki.apache.org/lucene-java/BasicsOfPerformance
>>> http://wiki.apache.org/lucene-java/ImproveIndexingSpeed
>>> http://wiki.apache.org/lucene-java/ImproveSearchingSpeed
>>>
>>>
>>> -- 
>>> karl
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>
>>>
>>> No virus found in this incoming message.
>>> Checked by AVG Free Edition.
>>> Version: 7.5.476 / Virus Database: 269.11.0/927 - Release Date:  
>>> 7/30/2007
>>> 5:02 PM
>>>
>>>
>>> No virus found in this outgoing message.
>>> Checked by AVG Free Edition.
>>> Version: 7.5.476 / Virus Database: 269.11.0/929 - Release Date:  
>>> 7/31/2007
>>> 5:26 PM
>>>
>>>
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>
>> 
>> 
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>> 
>> 
>> No virus found in this incoming message.
>> Checked by AVG Free Edition. 
>> Version: 7.5.476 / Virus Database: 269.11.2/933 - Release Date: 8/2/2007
>> 2:22 PM
>>  
>> 
>> No virus found in this outgoing message.
>> Checked by AVG Free Edition. 
>> Version: 7.5.476 / Virus Database: 269.11.8/940 - Release Date: 8/6/2007
>> 4:53 PM
>>  
>> 
>> 
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>> 
>> 
>> 
> 
> -- 
> View this message in context:
> http://www.nabble.com/High-CPU-usage-duing-index-and-search-tf4190756.html#a12035676
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
> 
> 
> No virus found in this incoming message.
> Checked by AVG Free Edition. 
> Version: 7.5.476 / Virus Database: 269.11.8/940 - Release Date: 8/6/2007
> 4:53 PM
>  
> 
> No virus found in this outgoing message.
> Checked by AVG Free Edition. 
> Version: 7.5.476 / Virus Database: 269.11.15/949 - Release Date: 8/12/2007
> 11:03 AM
>  
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
> 
> 
> 

-- 
View this message in context: http://www.nabble.com/High-CPU-usage-duing-index-and-search-tf4190756.html#a12122714
Sent from the Lucene - Java Users mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


No virus found in this incoming message.
Checked by AVG Free Edition. 
Version: 7.5.476 / Virus Database: 269.11.15/949 - Release Date: 8/12/2007 11:03 AM
 

No virus found in this outgoing message.
Checked by AVG Free Edition. 
Version: 7.5.476 / Virus Database: 269.11.19/953 - Release Date: 8/14/2007 5:19 PM
 


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


RE: High CPU usage duing index and search

Posted by testn <te...@doramail.com>.
To me, it looks like what you are trying to achieve is more suitable to be
the database where it can help you do grouping and sorting, etc. But if you
still want to achieve it using lucene, you might want to post some code so
that I can go through it and see why it uses so much resource then.


Chew Yee Chuang wrote:
> 
> Hi testn,
> 
> I have tested Filter, it is pretty fast, but still take a lot of CPU
> resource, Maybe it could due to the number of filter I run.
> 
> Thank you
> eChuang, Chew
> 
> 
> -----Original Message-----
> From: testn [mailto:test1@doramail.com] 
> Sent: Tuesday, August 07, 2007 10:37 PM
> To: java-user@lucene.apache.org
> Subject: RE: High CPU usage duing index and search
> 
> 
> Check out Filter class. You can create a separate filter for each field
> and
> then chain them together using ChainFilter. If you cache the filter, it
> will
> be pretty fast. 
> 
> 
> Chew Yee Chuang wrote:
>> 
>> Greetings,
>> 
>> Yes, process a little bit and stop for a while really reduce the CPU
>> usage,
>> but I need to find out a balance so that the indexing or searching will
>> not
>> have so much delay.
>> 
>> Execute 20,000 queries at a time is because the process is generating the
>> aggregation data for reporting,
>> E.g Gender (M,F), Department (Accounting, R&D, Financial,...etc), 
>> 1Q - Gender:M AND Department: Accounting
>> 2Q - Gender:M AND Department: R&D
>> 3Q - Gender:M AND Department: Financial
>> 4Q - Gender:F AND Department: Accounting
>> 5Q - ....
>> Thus, the more combination, the more query need to run. For now, I still
>> can't get any idea on how to reduce it, just thinking maybe there is a
>> different way to index it so that I can get It easily.
>> 
>> Any help would be appreciated.
>> 
>> Thanks
>> eChuang, Chew
>> 
>> -----Original Message-----
>> From: karl wettin [mailto:karl.wettin@gmail.com] 
>> Sent: Thursday, August 02, 2007 7:11 AM
>> To: java-user@lucene.apache.org
>> Subject: Re: High CPU usage duing index and search
>> 
>> It sounds like you have a fairly busy system, perhaps 100% load on the
>> process is not that strange, at least not during short periods of time.
>> 
>> A simpler solution would be to nice the process a little bit in order to
>> give your background jobs some more time to think.
>> 
>> Running a profiler is still the best advice I can think of. It should
>> clearly show you what is going on when you run out of CPU.
>> 
>> --  
>> karl
>> 
>> 1 aug 2007 kl. 04.29 skrev Chew Yee Chuang:
>> 
>>> Hi,
>>>
>>> Thanks for the link provided, actually I've go through those  
>>> article when I
>>> developing the index and search function for my application. I  
>>> haven’t try
>>> profiler yet, but I monitor the CPU usage and notice that whatever  
>>> index or
>>> search performing, the CPU usage raise to 100%. Below I will try to
>>> elaborate more on what my application is doing and how I index and  
>>> search.
>>>
>>> There are many concurrent process running, first, the application  
>>> will write
>>> records that received into a text file with tab separated each  
>>> different
>>> field. Application will point to a new file every 10mins and start  
>>> writing
>>> to it. So every file will contains only 10mins record, approximate  
>>> 600,000
>>> records per file. Then, the indexing process will check whether  
>>> there is a
>>> text file to be index, if it is, the thread will wake up and start  
>>> perform
>>> indexing.
>>>
>>> The indexing process will first add documents to RAMDir, Then  
>>> later, add
>>> RAMDir into FSDir by calling addIndexNoOptimize() when there is  
>>> 100,000
>>> documents(32 fields per doc) in RAMDir. There is only 1 IndexWriter 
>>> (FSDir)
>>> was created but a few IndexWriter(RAMDir) was created during the whole
>>> process. Below are some configuration for IndexWriters that I  
>>> mentioned:-
>>>
>>> IndexWriter (RAMDir)
>>> - SimpleAnalyzer
>>> - setMaxBufferedDocs(10000)
>>> - Filed.Store.YES
>>> - Field.Index.NO_NORMS
>>>
>>> IndexWriter (FSDir)
>>> - SimpleAnalyzer
>>> - setMergeFactor(20)
>>> - addIndexesNoOptimize()
>>>
>>> For the searching, because there are many queries(20,000) run  
>>> continuously
>>> to generate the aggregate table for reporting purpose. All this  
>>> queries is
>>> run in nested loop, and there is only 1 Searcher created, I try  
>>> searcher and
>>> filter as well, filter give me better result, but both also utilize  
>>> lots of
>>> CPU resources.
>>>
>>> Hope this info will help and sorry for my bad English.
>>>
>>> Thanks
>>> eChuang, Chew
>>>
>>> -----Original Message-----
>>> From: karl wettin [mailto:karl.wettin@gmail.com]
>>> Sent: Tuesday, July 31, 2007 5:54 PM
>>> To: java-user@lucene.apache.org
>>> Subject: Re: High CPU usage duing index and search
>>>
>>>
>>> 31 jul 2007 kl. 05.25 skrev Chew Yee Chuang:
>>>> But just notice that when Lucene performing search or index,
>>>> the CPU usage on my machine raise to 100%, because of this issue,
>>>> some of my
>>>> others backend process will slow down eventually. Just want to know
>>>> does
>>>> anyone face this problem before ? and is it any idea on how to
>>>> overcome this
>>>> problem ?
>>>
>>> Did you run a profiler to see what it is that consume all the  
>>> resources?
>>> It is very hard to guess based on the information you supplied. Start
>>> here:
>>>
>>> http://wiki.apache.org/lucene-java/BasicsOfPerformance
>>> http://wiki.apache.org/lucene-java/ImproveIndexingSpeed
>>> http://wiki.apache.org/lucene-java/ImproveSearchingSpeed
>>>
>>>
>>> -- 
>>> karl
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>
>>>
>>> No virus found in this incoming message.
>>> Checked by AVG Free Edition.
>>> Version: 7.5.476 / Virus Database: 269.11.0/927 - Release Date:  
>>> 7/30/2007
>>> 5:02 PM
>>>
>>>
>>> No virus found in this outgoing message.
>>> Checked by AVG Free Edition.
>>> Version: 7.5.476 / Virus Database: 269.11.0/929 - Release Date:  
>>> 7/31/2007
>>> 5:26 PM
>>>
>>>
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>
>> 
>> 
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>> 
>> 
>> No virus found in this incoming message.
>> Checked by AVG Free Edition. 
>> Version: 7.5.476 / Virus Database: 269.11.2/933 - Release Date: 8/2/2007
>> 2:22 PM
>>  
>> 
>> No virus found in this outgoing message.
>> Checked by AVG Free Edition. 
>> Version: 7.5.476 / Virus Database: 269.11.8/940 - Release Date: 8/6/2007
>> 4:53 PM
>>  
>> 
>> 
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>> 
>> 
>> 
> 
> -- 
> View this message in context:
> http://www.nabble.com/High-CPU-usage-duing-index-and-search-tf4190756.html#a12035676
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
> 
> 
> No virus found in this incoming message.
> Checked by AVG Free Edition. 
> Version: 7.5.476 / Virus Database: 269.11.8/940 - Release Date: 8/6/2007
> 4:53 PM
>  
> 
> No virus found in this outgoing message.
> Checked by AVG Free Edition. 
> Version: 7.5.476 / Virus Database: 269.11.15/949 - Release Date: 8/12/2007
> 11:03 AM
>  
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
> 
> 
> 

-- 
View this message in context: http://www.nabble.com/High-CPU-usage-duing-index-and-search-tf4190756.html#a12122714
Sent from the Lucene - Java Users mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


RE: High CPU usage duing index and search

Posted by Chew Yee Chuang <ye...@tecforte.com>.
Hi testn,

I have tested Filter, it is pretty fast, but still take a lot of CPU resource, Maybe it could due to the number of filter I run.

Thank you
eChuang, Chew


-----Original Message-----
From: testn [mailto:test1@doramail.com] 
Sent: Tuesday, August 07, 2007 10:37 PM
To: java-user@lucene.apache.org
Subject: RE: High CPU usage duing index and search


Check out Filter class. You can create a separate filter for each field and
then chain them together using ChainFilter. If you cache the filter, it will
be pretty fast. 


Chew Yee Chuang wrote:
> 
> Greetings,
> 
> Yes, process a little bit and stop for a while really reduce the CPU
> usage,
> but I need to find out a balance so that the indexing or searching will
> not
> have so much delay.
> 
> Execute 20,000 queries at a time is because the process is generating the
> aggregation data for reporting,
> E.g Gender (M,F), Department (Accounting, R&D, Financial,...etc), 
> 1Q - Gender:M AND Department: Accounting
> 2Q - Gender:M AND Department: R&D
> 3Q - Gender:M AND Department: Financial
> 4Q - Gender:F AND Department: Accounting
> 5Q - ....
> Thus, the more combination, the more query need to run. For now, I still
> can't get any idea on how to reduce it, just thinking maybe there is a
> different way to index it so that I can get It easily.
> 
> Any help would be appreciated.
> 
> Thanks
> eChuang, Chew
> 
> -----Original Message-----
> From: karl wettin [mailto:karl.wettin@gmail.com] 
> Sent: Thursday, August 02, 2007 7:11 AM
> To: java-user@lucene.apache.org
> Subject: Re: High CPU usage duing index and search
> 
> It sounds like you have a fairly busy system, perhaps 100% load on the
> process is not that strange, at least not during short periods of time.
> 
> A simpler solution would be to nice the process a little bit in order to
> give your background jobs some more time to think.
> 
> Running a profiler is still the best advice I can think of. It should
> clearly show you what is going on when you run out of CPU.
> 
> --  
> karl
> 
> 1 aug 2007 kl. 04.29 skrev Chew Yee Chuang:
> 
>> Hi,
>>
>> Thanks for the link provided, actually I've go through those  
>> article when I
>> developing the index and search function for my application. I  
>> haven’t try
>> profiler yet, but I monitor the CPU usage and notice that whatever  
>> index or
>> search performing, the CPU usage raise to 100%. Below I will try to
>> elaborate more on what my application is doing and how I index and  
>> search.
>>
>> There are many concurrent process running, first, the application  
>> will write
>> records that received into a text file with tab separated each  
>> different
>> field. Application will point to a new file every 10mins and start  
>> writing
>> to it. So every file will contains only 10mins record, approximate  
>> 600,000
>> records per file. Then, the indexing process will check whether  
>> there is a
>> text file to be index, if it is, the thread will wake up and start  
>> perform
>> indexing.
>>
>> The indexing process will first add documents to RAMDir, Then  
>> later, add
>> RAMDir into FSDir by calling addIndexNoOptimize() when there is  
>> 100,000
>> documents(32 fields per doc) in RAMDir. There is only 1 IndexWriter 
>> (FSDir)
>> was created but a few IndexWriter(RAMDir) was created during the whole
>> process. Below are some configuration for IndexWriters that I  
>> mentioned:-
>>
>> IndexWriter (RAMDir)
>> - SimpleAnalyzer
>> - setMaxBufferedDocs(10000)
>> - Filed.Store.YES
>> - Field.Index.NO_NORMS
>>
>> IndexWriter (FSDir)
>> - SimpleAnalyzer
>> - setMergeFactor(20)
>> - addIndexesNoOptimize()
>>
>> For the searching, because there are many queries(20,000) run  
>> continuously
>> to generate the aggregate table for reporting purpose. All this  
>> queries is
>> run in nested loop, and there is only 1 Searcher created, I try  
>> searcher and
>> filter as well, filter give me better result, but both also utilize  
>> lots of
>> CPU resources.
>>
>> Hope this info will help and sorry for my bad English.
>>
>> Thanks
>> eChuang, Chew
>>
>> -----Original Message-----
>> From: karl wettin [mailto:karl.wettin@gmail.com]
>> Sent: Tuesday, July 31, 2007 5:54 PM
>> To: java-user@lucene.apache.org
>> Subject: Re: High CPU usage duing index and search
>>
>>
>> 31 jul 2007 kl. 05.25 skrev Chew Yee Chuang:
>>> But just notice that when Lucene performing search or index,
>>> the CPU usage on my machine raise to 100%, because of this issue,
>>> some of my
>>> others backend process will slow down eventually. Just want to know
>>> does
>>> anyone face this problem before ? and is it any idea on how to
>>> overcome this
>>> problem ?
>>
>> Did you run a profiler to see what it is that consume all the  
>> resources?
>> It is very hard to guess based on the information you supplied. Start
>> here:
>>
>> http://wiki.apache.org/lucene-java/BasicsOfPerformance
>> http://wiki.apache.org/lucene-java/ImproveIndexingSpeed
>> http://wiki.apache.org/lucene-java/ImproveSearchingSpeed
>>
>>
>> -- 
>> karl
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
>> No virus found in this incoming message.
>> Checked by AVG Free Edition.
>> Version: 7.5.476 / Virus Database: 269.11.0/927 - Release Date:  
>> 7/30/2007
>> 5:02 PM
>>
>>
>> No virus found in this outgoing message.
>> Checked by AVG Free Edition.
>> Version: 7.5.476 / Virus Database: 269.11.0/929 - Release Date:  
>> 7/31/2007
>> 5:26 PM
>>
>>
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
> 
> 
> No virus found in this incoming message.
> Checked by AVG Free Edition. 
> Version: 7.5.476 / Virus Database: 269.11.2/933 - Release Date: 8/2/2007
> 2:22 PM
>  
> 
> No virus found in this outgoing message.
> Checked by AVG Free Edition. 
> Version: 7.5.476 / Virus Database: 269.11.8/940 - Release Date: 8/6/2007
> 4:53 PM
>  
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
> 
> 
> 

-- 
View this message in context: http://www.nabble.com/High-CPU-usage-duing-index-and-search-tf4190756.html#a12035676
Sent from the Lucene - Java Users mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


No virus found in this incoming message.
Checked by AVG Free Edition. 
Version: 7.5.476 / Virus Database: 269.11.8/940 - Release Date: 8/6/2007 4:53 PM
 

No virus found in this outgoing message.
Checked by AVG Free Edition. 
Version: 7.5.476 / Virus Database: 269.11.15/949 - Release Date: 8/12/2007 11:03 AM
 


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


RE: High CPU usage duing index and search

Posted by testn <te...@doramail.com>.
Check out Filter class. You can create a separate filter for each field and
then chain them together using ChainFilter. If you cache the filter, it will
be pretty fast. 


Chew Yee Chuang wrote:
> 
> Greetings,
> 
> Yes, process a little bit and stop for a while really reduce the CPU
> usage,
> but I need to find out a balance so that the indexing or searching will
> not
> have so much delay.
> 
> Execute 20,000 queries at a time is because the process is generating the
> aggregation data for reporting,
> E.g Gender (M,F), Department (Accounting, R&D, Financial,...etc), 
> 1Q - Gender:M AND Department: Accounting
> 2Q - Gender:M AND Department: R&D
> 3Q - Gender:M AND Department: Financial
> 4Q - Gender:F AND Department: Accounting
> 5Q - ....
> Thus, the more combination, the more query need to run. For now, I still
> can't get any idea on how to reduce it, just thinking maybe there is a
> different way to index it so that I can get It easily.
> 
> Any help would be appreciated.
> 
> Thanks
> eChuang, Chew
> 
> -----Original Message-----
> From: karl wettin [mailto:karl.wettin@gmail.com] 
> Sent: Thursday, August 02, 2007 7:11 AM
> To: java-user@lucene.apache.org
> Subject: Re: High CPU usage duing index and search
> 
> It sounds like you have a fairly busy system, perhaps 100% load on the
> process is not that strange, at least not during short periods of time.
> 
> A simpler solution would be to nice the process a little bit in order to
> give your background jobs some more time to think.
> 
> Running a profiler is still the best advice I can think of. It should
> clearly show you what is going on when you run out of CPU.
> 
> --  
> karl
> 
> 1 aug 2007 kl. 04.29 skrev Chew Yee Chuang:
> 
>> Hi,
>>
>> Thanks for the link provided, actually I've go through those  
>> article when I
>> developing the index and search function for my application. I  
>> haven’t try
>> profiler yet, but I monitor the CPU usage and notice that whatever  
>> index or
>> search performing, the CPU usage raise to 100%. Below I will try to
>> elaborate more on what my application is doing and how I index and  
>> search.
>>
>> There are many concurrent process running, first, the application  
>> will write
>> records that received into a text file with tab separated each  
>> different
>> field. Application will point to a new file every 10mins and start  
>> writing
>> to it. So every file will contains only 10mins record, approximate  
>> 600,000
>> records per file. Then, the indexing process will check whether  
>> there is a
>> text file to be index, if it is, the thread will wake up and start  
>> perform
>> indexing.
>>
>> The indexing process will first add documents to RAMDir, Then  
>> later, add
>> RAMDir into FSDir by calling addIndexNoOptimize() when there is  
>> 100,000
>> documents(32 fields per doc) in RAMDir. There is only 1 IndexWriter 
>> (FSDir)
>> was created but a few IndexWriter(RAMDir) was created during the whole
>> process. Below are some configuration for IndexWriters that I  
>> mentioned:-
>>
>> IndexWriter (RAMDir)
>> - SimpleAnalyzer
>> - setMaxBufferedDocs(10000)
>> - Filed.Store.YES
>> - Field.Index.NO_NORMS
>>
>> IndexWriter (FSDir)
>> - SimpleAnalyzer
>> - setMergeFactor(20)
>> - addIndexesNoOptimize()
>>
>> For the searching, because there are many queries(20,000) run  
>> continuously
>> to generate the aggregate table for reporting purpose. All this  
>> queries is
>> run in nested loop, and there is only 1 Searcher created, I try  
>> searcher and
>> filter as well, filter give me better result, but both also utilize  
>> lots of
>> CPU resources.
>>
>> Hope this info will help and sorry for my bad English.
>>
>> Thanks
>> eChuang, Chew
>>
>> -----Original Message-----
>> From: karl wettin [mailto:karl.wettin@gmail.com]
>> Sent: Tuesday, July 31, 2007 5:54 PM
>> To: java-user@lucene.apache.org
>> Subject: Re: High CPU usage duing index and search
>>
>>
>> 31 jul 2007 kl. 05.25 skrev Chew Yee Chuang:
>>> But just notice that when Lucene performing search or index,
>>> the CPU usage on my machine raise to 100%, because of this issue,
>>> some of my
>>> others backend process will slow down eventually. Just want to know
>>> does
>>> anyone face this problem before ? and is it any idea on how to
>>> overcome this
>>> problem ?
>>
>> Did you run a profiler to see what it is that consume all the  
>> resources?
>> It is very hard to guess based on the information you supplied. Start
>> here:
>>
>> http://wiki.apache.org/lucene-java/BasicsOfPerformance
>> http://wiki.apache.org/lucene-java/ImproveIndexingSpeed
>> http://wiki.apache.org/lucene-java/ImproveSearchingSpeed
>>
>>
>> -- 
>> karl
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
>> No virus found in this incoming message.
>> Checked by AVG Free Edition.
>> Version: 7.5.476 / Virus Database: 269.11.0/927 - Release Date:  
>> 7/30/2007
>> 5:02 PM
>>
>>
>> No virus found in this outgoing message.
>> Checked by AVG Free Edition.
>> Version: 7.5.476 / Virus Database: 269.11.0/929 - Release Date:  
>> 7/31/2007
>> 5:26 PM
>>
>>
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
> 
> 
> No virus found in this incoming message.
> Checked by AVG Free Edition. 
> Version: 7.5.476 / Virus Database: 269.11.2/933 - Release Date: 8/2/2007
> 2:22 PM
>  
> 
> No virus found in this outgoing message.
> Checked by AVG Free Edition. 
> Version: 7.5.476 / Virus Database: 269.11.8/940 - Release Date: 8/6/2007
> 4:53 PM
>  
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
> 
> 
> 

-- 
View this message in context: http://www.nabble.com/High-CPU-usage-duing-index-and-search-tf4190756.html#a12035676
Sent from the Lucene - Java Users mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


RE: High CPU usage duing index and search

Posted by Chew Yee Chuang <ye...@tecforte.com>.
Greetings,

Yes, process a little bit and stop for a while really reduce the CPU usage,
but I need to find out a balance so that the indexing or searching will not
have so much delay.

Execute 20,000 queries at a time is because the process is generating the
aggregation data for reporting,
E.g Gender (M,F), Department (Accounting, R&D, Financial,...etc), 
1Q - Gender:M AND Department: Accounting
2Q - Gender:M AND Department: R&D
3Q - Gender:M AND Department: Financial
4Q - Gender:F AND Department: Accounting
5Q - ....
Thus, the more combination, the more query need to run. For now, I still
can't get any idea on how to reduce it, just thinking maybe there is a
different way to index it so that I can get It easily.

Any help would be appreciated.

Thanks
eChuang, Chew

-----Original Message-----
From: karl wettin [mailto:karl.wettin@gmail.com] 
Sent: Thursday, August 02, 2007 7:11 AM
To: java-user@lucene.apache.org
Subject: Re: High CPU usage duing index and search

It sounds like you have a fairly busy system, perhaps 100% load on the
process is not that strange, at least not during short periods of time.

A simpler solution would be to nice the process a little bit in order to
give your background jobs some more time to think.

Running a profiler is still the best advice I can think of. It should
clearly show you what is going on when you run out of CPU.

--  
karl

1 aug 2007 kl. 04.29 skrev Chew Yee Chuang:

> Hi,
>
> Thanks for the link provided, actually I've go through those  
> article when I
> developing the index and search function for my application. I  
> haven’t try
> profiler yet, but I monitor the CPU usage and notice that whatever  
> index or
> search performing, the CPU usage raise to 100%. Below I will try to
> elaborate more on what my application is doing and how I index and  
> search.
>
> There are many concurrent process running, first, the application  
> will write
> records that received into a text file with tab separated each  
> different
> field. Application will point to a new file every 10mins and start  
> writing
> to it. So every file will contains only 10mins record, approximate  
> 600,000
> records per file. Then, the indexing process will check whether  
> there is a
> text file to be index, if it is, the thread will wake up and start  
> perform
> indexing.
>
> The indexing process will first add documents to RAMDir, Then  
> later, add
> RAMDir into FSDir by calling addIndexNoOptimize() when there is  
> 100,000
> documents(32 fields per doc) in RAMDir. There is only 1 IndexWriter 
> (FSDir)
> was created but a few IndexWriter(RAMDir) was created during the whole
> process. Below are some configuration for IndexWriters that I  
> mentioned:-
>
> IndexWriter (RAMDir)
> - SimpleAnalyzer
> - setMaxBufferedDocs(10000)
> - Filed.Store.YES
> - Field.Index.NO_NORMS
>
> IndexWriter (FSDir)
> - SimpleAnalyzer
> - setMergeFactor(20)
> - addIndexesNoOptimize()
>
> For the searching, because there are many queries(20,000) run  
> continuously
> to generate the aggregate table for reporting purpose. All this  
> queries is
> run in nested loop, and there is only 1 Searcher created, I try  
> searcher and
> filter as well, filter give me better result, but both also utilize  
> lots of
> CPU resources.
>
> Hope this info will help and sorry for my bad English.
>
> Thanks
> eChuang, Chew
>
> -----Original Message-----
> From: karl wettin [mailto:karl.wettin@gmail.com]
> Sent: Tuesday, July 31, 2007 5:54 PM
> To: java-user@lucene.apache.org
> Subject: Re: High CPU usage duing index and search
>
>
> 31 jul 2007 kl. 05.25 skrev Chew Yee Chuang:
>> But just notice that when Lucene performing search or index,
>> the CPU usage on my machine raise to 100%, because of this issue,
>> some of my
>> others backend process will slow down eventually. Just want to know
>> does
>> anyone face this problem before ? and is it any idea on how to
>> overcome this
>> problem ?
>
> Did you run a profiler to see what it is that consume all the  
> resources?
> It is very hard to guess based on the information you supplied. Start
> here:
>
> http://wiki.apache.org/lucene-java/BasicsOfPerformance
> http://wiki.apache.org/lucene-java/ImproveIndexingSpeed
> http://wiki.apache.org/lucene-java/ImproveSearchingSpeed
>
>
> -- 
> karl
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>
> No virus found in this incoming message.
> Checked by AVG Free Edition.
> Version: 7.5.476 / Virus Database: 269.11.0/927 - Release Date:  
> 7/30/2007
> 5:02 PM
>
>
> No virus found in this outgoing message.
> Checked by AVG Free Edition.
> Version: 7.5.476 / Virus Database: 269.11.0/929 - Release Date:  
> 7/31/2007
> 5:26 PM
>
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


No virus found in this incoming message.
Checked by AVG Free Edition. 
Version: 7.5.476 / Virus Database: 269.11.2/933 - Release Date: 8/2/2007
2:22 PM
 

No virus found in this outgoing message.
Checked by AVG Free Edition. 
Version: 7.5.476 / Virus Database: 269.11.8/940 - Release Date: 8/6/2007
4:53 PM
 


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: High CPU usage duing index and search

Posted by karl wettin <ka...@gmail.com>.
It sounds like you have a fairly busy system, perhaps 100% load on the
process is not that strange, at least not during short periods of time.

A simpler solution would be to nice the process a little bit in order to
give your background jobs some more time to think.

Running a profiler is still the best advice I can think of. It should
clearly show you what is going on when you run out of CPU.

--  
karl

1 aug 2007 kl. 04.29 skrev Chew Yee Chuang:

> Hi,
>
> Thanks for the link provided, actually I've go through those  
> article when I
> developing the index and search function for my application. I  
> haven’t try
> profiler yet, but I monitor the CPU usage and notice that whatever  
> index or
> search performing, the CPU usage raise to 100%. Below I will try to
> elaborate more on what my application is doing and how I index and  
> search.
>
> There are many concurrent process running, first, the application  
> will write
> records that received into a text file with tab separated each  
> different
> field. Application will point to a new file every 10mins and start  
> writing
> to it. So every file will contains only 10mins record, approximate  
> 600,000
> records per file. Then, the indexing process will check whether  
> there is a
> text file to be index, if it is, the thread will wake up and start  
> perform
> indexing.
>
> The indexing process will first add documents to RAMDir, Then  
> later, add
> RAMDir into FSDir by calling addIndexNoOptimize() when there is  
> 100,000
> documents(32 fields per doc) in RAMDir. There is only 1 IndexWriter 
> (FSDir)
> was created but a few IndexWriter(RAMDir) was created during the whole
> process. Below are some configuration for IndexWriters that I  
> mentioned:-
>
> IndexWriter (RAMDir)
> - SimpleAnalyzer
> - setMaxBufferedDocs(10000)
> - Filed.Store.YES
> - Field.Index.NO_NORMS
>
> IndexWriter (FSDir)
> - SimpleAnalyzer
> - setMergeFactor(20)
> - addIndexesNoOptimize()
>
> For the searching, because there are many queries(20,000) run  
> continuously
> to generate the aggregate table for reporting purpose. All this  
> queries is
> run in nested loop, and there is only 1 Searcher created, I try  
> searcher and
> filter as well, filter give me better result, but both also utilize  
> lots of
> CPU resources.
>
> Hope this info will help and sorry for my bad English.
>
> Thanks
> eChuang, Chew
>
> -----Original Message-----
> From: karl wettin [mailto:karl.wettin@gmail.com]
> Sent: Tuesday, July 31, 2007 5:54 PM
> To: java-user@lucene.apache.org
> Subject: Re: High CPU usage duing index and search
>
>
> 31 jul 2007 kl. 05.25 skrev Chew Yee Chuang:
>> But just notice that when Lucene performing search or index,
>> the CPU usage on my machine raise to 100%, because of this issue,
>> some of my
>> others backend process will slow down eventually. Just want to know
>> does
>> anyone face this problem before ? and is it any idea on how to
>> overcome this
>> problem ?
>
> Did you run a profiler to see what it is that consume all the  
> resources?
> It is very hard to guess based on the information you supplied. Start
> here:
>
> http://wiki.apache.org/lucene-java/BasicsOfPerformance
> http://wiki.apache.org/lucene-java/ImproveIndexingSpeed
> http://wiki.apache.org/lucene-java/ImproveSearchingSpeed
>
>
> -- 
> karl
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>
> No virus found in this incoming message.
> Checked by AVG Free Edition.
> Version: 7.5.476 / Virus Database: 269.11.0/927 - Release Date:  
> 7/30/2007
> 5:02 PM
>
>
> No virus found in this outgoing message.
> Checked by AVG Free Edition.
> Version: 7.5.476 / Virus Database: 269.11.0/929 - Release Date:  
> 7/31/2007
> 5:26 PM
>
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org