Posted to dev@lucenenet.apache.org by GitBox <gi...@apache.org> on 2021/12/27 18:31:50 UTC

[GitHub] [lucenenet] ahmadmdabit opened a new issue #591: Minimize the HDD load

ahmadmdabit opened a new issue #591:
URL: https://github.com/apache/lucenenet/issues/591


   I have a C# .NET project that uses the Lucene.Net@4.8.0-beta00013 package. It should work well on computers with either an HDD or an SSD, but in my tests, searching the index puts a very heavy load on the HDD, and sometimes the application stops responding. The problem appears only on HDDs, not on SSDs.
   
   **I want to minimize the HDD load. Any ideas, please?**
   
   Note: I have provided the LuceneIndexer class that I use.
   
   Best Regards
   
   [This is the gist that contains the classes used](https://gist.github.com/ahmadmdabit/55744c840816faf35f468614c7746ac8)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@lucenenet.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [lucenenet] rclabo commented on issue #591: Minimize the HDD load

Posted by GitBox <gi...@apache.org>.
rclabo commented on issue #591:
URL: https://github.com/apache/lucenenet/issues/591#issuecomment-1002790795


   I agree that the #1 reason that Shad listed is almost certainly the issue. If you open and close the reader frequently, it can be a huge performance hit because all caches need to be rebuilt, including any field caches used for sorting or other Lucene.NET features. In fact, the general guidance is not only to use a single `IndexReader` (or a single `IndexWriter`, obtaining readers via `IndexWriter.GetReader()`) **but also** to run a few typical queries against that object at system startup to "warm up the system" and give it a chance to build those caches. If the system instead closes and reopens the reader for each search, it is paying that "warm up" penalty on every search, over and over again.
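   
   A minimal sketch of that guidance, assuming a read-only index on disk: open one `DirectoryReader` for the lifetime of the application, share one `IndexSearcher` over it, and run a few representative queries at startup. The `SharedSearcher` class, its `WarmUp` method, and the field name `title` are hypothetical names used only for illustration; they do not come from the gist.
   
   ```csharp
   using System;
   using Lucene.Net.Index;
   using Lucene.Net.Search;
   using Lucene.Net.Store;
   // Alias avoids a clash with System.IO.Directory when implicit usings are on.
   using LuceneDirectory = Lucene.Net.Store.Directory;
   
   // Hypothetical wrapper: create ONE instance per application and reuse it.
   public sealed class SharedSearcher : IDisposable
   {
       private readonly LuceneDirectory directory;
       private readonly DirectoryReader reader;
       private readonly IndexSearcher searcher;
   
       public SharedSearcher(string indexPath)
       {
           directory = FSDirectory.Open(indexPath);
           reader = DirectoryReader.Open(directory);   // opened once, not per search
           searcher = new IndexSearcher(reader);
       }
   
       // Run a few typical queries so caches are populated before real traffic.
       public void WarmUp(params Query[] typicalQueries)
       {
           foreach (Query query in typicalQueries)
           {
               searcher.Search(query, 10);   // results are discarded on purpose
           }
       }
   
       // Safe to call from multiple threads; no reader is opened or closed here.
       public TopDocs Search(Query query, int maxHits) => searcher.Search(query, maxHits);
   
       public void Dispose()
       {
           reader.Dispose();
           directory.Dispose();
       }
   }
   ```
   
   At startup this might be used as `shared.WarmUp(new TermQuery(new Term("title", "lucene")));`, after which every search goes through `shared.Search(...)`. A reader opened this way does not see later index changes; if the index is updated while the application runs, something like `DirectoryReader.OpenIfChanged` or a `SearcherManager` would be needed.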





[GitHub] [lucenenet] NightOwl888 closed issue #591: Minimize the HDD load

Posted by GitBox <gi...@apache.org>.
NightOwl888 closed issue #591:
URL: https://github.com/apache/lucenenet/issues/591


   





[GitHub] [lucenenet] ahmadmdabit commented on issue #591: Minimize the HDD load

Posted by GitBox <gi...@apache.org>.
ahmadmdabit commented on issue #591:
URL: https://github.com/apache/lucenenet/issues/591#issuecomment-1001844190


   Hi.
   _I thank you very much for your quick response. I will take care of everything you told me about._
   My Regards.





[GitHub] [lucenenet] NightOwl888 commented on issue #591: Minimize the HDD load

Posted by GitBox <gi...@apache.org>.
NightOwl888 commented on issue #591:
URL: https://github.com/apache/lucenenet/issues/591#issuecomment-1001792041


   Hi. 
   
   There are a few things I spotted in the code that could be causing performance issues.
   
   1. You are opening and closing `IndexReader` within a single search method. It works best to either use a singleton `IndexReader` instance (application-wide) or to use a singleton `IndexWriter` instance and call `IndexWriter.GetReader()` as needed to do searches (assuming your app needs to do both operations simultaneously). Both of these classes are completely thread-safe and are designed to service requests in parallel. Opening and closing the reader on every search is likely causing most of your problem, because it generates a lot of disk activity.
   2. You are using high compression. This will definitely run slower than using normal compression.
   3. You are calling `GC.Collect()` and `GC.WaitForPendingFinalizers()`. In general, most .NET applications don't need to call the GC directly except to disable finalizers. Do note that there aren't many classes in Lucene.NET that use finalizers, so it is unclear why you are doing this.
   4. You are using `SimpleFSDirectory`, which uses the default `FileStream` buffer size. In general, using `MMapDirectory` will provide better performance.
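   
   Points 1 and 4 could be combined roughly as follows. This is only a sketch under stated assumptions: the `IndexHost` class name is hypothetical, the analyzer is passed in by the caller, and `SearcherManager` is used here as one way to hand out near-real-time searchers from a single `IndexWriter` (it is not mentioned in the list above, which refers to `IndexWriter.GetReader()` directly).
   
   ```csharp
   using System;
   using System.IO;
   using Lucene.Net.Analysis;
   using Lucene.Net.Index;
   using Lucene.Net.Search;
   using Lucene.Net.Store;
   using Lucene.Net.Util;
   
   // Hypothetical application-wide holder for the index; create it once.
   public sealed class IndexHost : IDisposable
   {
       private readonly MMapDirectory directory;
       private readonly IndexWriter writer;
       private readonly SearcherManager searchers;
   
       public IndexHost(string indexPath, Analyzer analyzer)
       {
           // Point 4: memory-mapped directory instead of SimpleFSDirectory.
           directory = new MMapDirectory(new DirectoryInfo(indexPath));
   
           // Point 1: one IndexWriter for the whole application.
           var config = new IndexWriterConfig(LuceneVersion.LUCENE_48, analyzer);
           writer = new IndexWriter(directory, config);
   
           // Searchers are derived from the writer (near-real-time readers).
           // Arguments: writer, applyAllDeletes = true, no custom SearcherFactory.
           searchers = new SearcherManager(writer, true, null);
       }
   
       public void Add(Lucene.Net.Documents.Document doc) => writer.AddDocument(doc);
   
       public TopDocs Search(Query query, int maxHits)
       {
           // Picks up recent writes if there are any; this could instead be
           // called on a timer if refreshing on every search proves too costly.
           searchers.MaybeRefresh();
   
           IndexSearcher searcher = searchers.Acquire();
           try
           {
               return searcher.Search(query, maxHits);
           }
           finally
           {
               searchers.Release(searcher);   // always hand the searcher back
           }
       }
   
       public void Dispose()
       {
           searchers.Dispose();
           writer.Dispose();
           directory.Dispose();
       }
   }
   ```
   
   Nothing in this sketch calls `GC.Collect()` or `GC.WaitForPendingFinalizers()` (point 3); the long-lived objects are released once, in `Dispose()`, when the application shuts down.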
   
   While it may not be affecting the performance much, the `List.GetRange()` method copies the elements to a new list. If you switch to using a `J2N.Collections.Generic.List<T>`, there is a [`.GetView()`](https://github.com/NightOwl888/J2N/blob/b7f80ca424af2b077b5d42beff4d9fbad8b05ef5/src/J2N/Collections/Generic/List.cs#L227) method that returns a bounded view over the *same* elements of the list instead of copying them to a new list.
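   
   As a rough illustration of the difference, the sketch below first shows that `GetRange()` on the built-in list produces an independent copy, then iterates a J2N view. It assumes that `GetView` takes the same `(index, count)` arguments as `GetRange`; check the linked source for the exact signature.
   
   ```csharp
   using System;
   using System.Collections.Generic;
   
   // Built-in list: GetRange allocates a new list and copies the elements.
   var numbers = new List<int> { 1, 2, 3, 4, 5 };
   List<int> copy = numbers.GetRange(1, 3);   // new list containing 2, 3, 4
   copy[0] = 99;                              // changes the copy only
   Console.WriteLine(numbers[1]);             // prints 2: the original is untouched
   
   // J2N list: GetView exposes a window over the same list without copying.
   // The (index, count) argument order is an assumption mirroring GetRange.
   var j2nNumbers = new J2N.Collections.Generic.List<int> { 1, 2, 3, 4, 5 };
   foreach (int item in j2nNumbers.GetView(1, 3))
   {
       Console.WriteLine(item);
   }
   ```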
   
   





[GitHub] [lucenenet] NightOwl888 commented on issue #591: Minimize the HDD load

Posted by GitBox <gi...@apache.org>.
NightOwl888 commented on issue #591:
URL: https://github.com/apache/lucenenet/issues/591#issuecomment-1006900379


   This question appears to be answered satisfactorily. Let us know if you need further assistance.

