Posted to dev@lucenenet.apache.org by juliya james <ju...@yahoo.co.in.INVALID> on 2018/04/04 12:51:21 UTC

Worsening of indexing performance with Lucene .Net 4.8_beta005 when compared with equivalent Java version and Lucene .Net 3.0.3

Hi,
We are seeing a substantial drop in indexing performance with Lucene.Net 4.8_beta005 compared with the equivalent Java version (Lucene 4.8.0) and with Lucene.Net 3.0.3.
The table below compares indexing time across these Lucene versions for several index sizes. As you can see, indexing time has roughly doubled with Lucene.Net 4.8_beta005, especially for the larger indexes.
Are there any known issues with indexing performance in Lucene.Net 4.8? Or is there an explanation for this behavior?

| # | Lucene.Net 4.8_beta005: index size (MB) | Lucene.Net 4.8_beta005: indexing time (s) | Lucene Java 4.8.0: index size (MB) | Lucene Java 4.8.0: indexing time (s) | Lucene.Net 3.0.3: index size (MB) | Lucene.Net 3.0.3: indexing time (s) |
| 1 | 5.4 | 3 | 5 | 2 | 8.2 | 1 |
| 2 | 27.46 | 14 | 25 | 8 | 40.6 | 8 |
| 3 | 41.32 | 21 | 32 | 13 | 58.49 | 12 |
| 4 | 47.66 | 32 | 45 | 15 | 78.85 | 16 |
| 5 | 95.3 | 60 | 90 | 25 | 157.3 | 33 |
| 6 | 238.14 | 143 | 221 | 62 | 388.15 | 82 |
| 7 | 476.4 | 282 | 385 | 140 | 771.09 | 169 |


Notes:
- The quantity of input data is the same for all measurements in a row; only the Lucene version was changed.
- The data was split into several documents, each of which may hold about 1 MB of data.
- Most of the data was indexed with the field properties [Field.Store.NO, Field.Index.ANALYZED_NO_NORMS].
- The "Index size" columns show the size of the generated index (in MB), and the "Indexing time" columns show the time taken to build it (in seconds).

Thanks & Regards,
Juliya
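
To put a number on the slowdown, the short Java sketch below recomputes the timing ratios from the table. The class and method names (IndexingSlowdown, ratio) are made up for illustration; the values are copied from the indexing-time columns above, and nothing here touches Lucene itself.

```java
// Illustrative only: class and method names are hypothetical. The arrays hold
// the indexing times in seconds for rows 1..7 of the table in this message.
final class IndexingSlowdown {
    static final int[] NET_48 = {3, 14, 21, 32, 60, 143, 282};   // Lucene.Net 4.8_beta005
    static final int[] JAVA_48 = {2, 8, 13, 15, 25, 62, 140};    // Lucene Java 4.8.0
    static final int[] NET_303 = {1, 8, 12, 16, 33, 82, 169};    // Lucene.Net 3.0.3

    // How many times longer the first measurement took than the baseline.
    static double ratio(int seconds, int baselineSeconds) {
        return (double) seconds / baselineSeconds;
    }

    public static void main(String[] args) {
        for (int i = 0; i < NET_48.length; i++) {
            System.out.printf("row %d: %.2fx vs Java 4.8.0, %.2fx vs Lucene.Net 3.0.3%n",
                    i + 1,
                    ratio(NET_48[i], JAVA_48[i]),
                    ratio(NET_48[i], NET_303[i]));
        }
    }
}
```

For the largest data set (row 7) this works out to about 2.0x the Java time and about 1.7x the Lucene.Net 3.0.3 time.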





RE: Worsening of indexing performance with Lucene .Net 4.8_beta005 when compared with equivalent Java version and Lucene .Net 3.0.3

Posted by Shad Storhaug <sh...@shadstorhaug.com>.
Juliya,

Thanks for the report.

Could you put together an integration test or a console application that demonstrates how you have Lucene.Net configured to see these performance issues when compared with Lucene.Net 3.0.3 and Lucene 4.8.0? There are many factors at play including which Codec, which Analyzer(s), and which data types you are using that may have less than optimal code, so we would need more information to narrow it down.

Do note that the IndexWriter was designed to throw exceptions and catch exceptions at various levels in the execution stack as a way to control the flow of the application, which is sure to make Lucene.Net somewhat less optimal than Lucene, especially when debugging. This probably doesn't explain all of the difference you are seeing, though.

Thanks,
Shad Storhaug (NightOwl888)
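
As a rough illustration of the point about exceptions used for control flow, here is a minimal Java sketch. It is not taken from Lucene or Lucene.Net: ExceptionFlowDemo, SkipItem, sumWithExceptions, and sumWithBranches are hypothetical names. Both methods compute the same sum; one signals "skip this item" by throwing and catching, the other with an ordinary branch. The timings it prints are only indicative (there is no JIT warm-up control), and the relative cost on .NET may differ from what the JVM shows.

```java
// Illustrative only: all names are hypothetical and unrelated to IndexWriter.
final class ExceptionFlowDemo {
    // Exception type used purely as a control-flow signal.
    static final class SkipItem extends RuntimeException {
    }

    static void checkEven(int value) {
        if (value % 2 != 0) {
            throw new SkipItem();
        }
    }

    // Sums the even numbers in [0, n), skipping odd ones via throw/catch.
    static long sumWithExceptions(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            try {
                checkEven(i);
                sum += i;
            } catch (SkipItem skip) {
                // Odd value: fall through to the next item.
            }
        }
        return sum;
    }

    // Same result, but the skip decision is a plain branch.
    static long sumWithBranches(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            if (i % 2 == 0) {
                sum += i;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        int n = 200_000;
        long t0 = System.nanoTime();
        long viaExceptions = sumWithExceptions(n);
        long t1 = System.nanoTime();
        long viaBranches = sumWithBranches(n);
        long t2 = System.nanoTime();
        System.out.println("results equal: " + (viaExceptions == viaBranches));
        System.out.println("exceptions: " + (t1 - t0) / 1_000_000 + " ms");
        System.out.println("branches:   " + (t2 - t1) / 1_000_000 + " ms");
    }
}
```

Running the same comparison with and without a debugger attached is one way to check how much of a measured gap comes from exception handling rather than from the indexing work itself.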


-----Original Message-----
From: juliya james [mailto:juliyamj@yahoo.co.in.INVALID] 
Sent: Monday, June 4, 2018 4:56 PM
To: dev@lucenenet.apache.org; mikemccand@apache.org; ehatcher@apache.org
Subject: Re: Worsening of indexing performance with Lucene .Net 4.8_beta005 when compared with equivalent Java version and Lucene .Net 3.0.3

Hi All,
Any information on the performance problem with Lucene.Net 4.8.0 mentioned in the previous email would be appreciated.
Thanks & Regards,
Juliya

Re: Worsening of indexing performance with Lucene .Net 4.8_beta005 when compared with equivalent Java version and Lucene .Net 3.0.3

Posted by juliya james <ju...@yahoo.co.in.INVALID>.
Hi All,
Any information on the performance problem with Lucene.Net 4.8.0 mentioned in the previous email would be appreciated.
Thanks & Regards,
Juliya