Posted to dev@cocoon.apache.org by "Antonio Gallardo (JIRA)" <ji...@apache.org> on 2007/07/23 02:23:31 UTC

[jira] Reopened: (COCOON-2065) huge performance increase of LuceneIndexTransformer on large Lucene indexes

     [ https://issues.apache.org/jira/browse/COCOON-2065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Antonio Gallardo reopened COCOON-2065:
--------------------------------------


The patch was not applied in Cocoon 2.1.11-dev.

> huge performance increase of LuceneIndexTransformer on large Lucene indexes
> ---------------------------------------------------------------------------
>
>                 Key: COCOON-2065
>                 URL: https://issues.apache.org/jira/browse/COCOON-2065
>             Project: Cocoon
>          Issue Type: Improvement
>          Components: Blocks: Lucene
>    Affects Versions: 2.1.6, 2.1.7, 2.1.8, 2.1.9, 2.1.10, 2.1.11-dev (Current SVN), 2.2-dev (Current SVN)
>            Reporter: Dominique De Munck
>            Assignee: Felix Knecht
>            Priority: Minor
>             Fix For: 2.1.11-dev (Current SVN), 2.2-dev (Current SVN)
>
>         Attachments: LuceneIndexTransformer.patch
>
>
> PROBLEM:
> The LuceneIndexTransformer optimizes the Lucene index every time you add an entry to the index.
> This slows indexing down enormously when the index is large. If, for example, you use it to update
> the entry on every check-in of a document, each check-in will be slow.
> E.g. I have a Pentium IV 2.4 GHz, and the Lucene index contains 10,000 documents.
> Where the index update itself takes only about 60 ms, the optimize that gets called can take 7 seconds!
> SOLUTION:
> I've created a patch that introduces an option "optimize-frequency" to determine the frequency of the optimize call.
> It defaults to 1 (current behaviour); when a user sets it to 50, the index will be optimized only once every 50 updates, and so on.
> If no optimization is wanted, you can set it to 0.
> This is consistent with the Lucene documentation (excerpt from the Lucene FAQ):
> "The IndexWriter class supports an optimize() method that compacts the index database and speedup queries. You may want to use this method after performing a complete indexing of your document set or after incremental updates of the index. If your incremental update adds documents frequently, you want to perform the optimization only once in a while to avoid the extra overhead of the optimization."
> PATCH INFO:
> added a configuration option + a function "needToOptimize()", which is called before optimizing.
> needToOptimize() uses a random number generator, to keep the code simple.
> - when the option is not set, CODE WILL BE EXECUTED AS BEFORE
> - tested on the 2.1.11 SVN branch; there are no differences in the "main" trunk, so it can be applied there as well.
> - Updated API docs
> - if patch accepted, I will also update the Wiki:
> http://wiki.apache.org/cocoon/LuceneIndexTransformer
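
The behaviour described in the issue above (an "optimize-frequency" value checked by a randomized needToOptimize()) could look roughly like the following Java sketch. This is not the code from LuceneIndexTransformer.patch: the class name OptimizeFrequencyPolicy, its constructor, and the injected java.util.Random are assumptions made purely for illustration.

```java
import java.util.Random;

/**
 * Hypothetical helper illustrating the "optimize-frequency" idea.
 * It is NOT taken from the actual Cocoon patch.
 */
final class OptimizeFrequencyPolicy {
    private final int optimizeFrequency;
    private final Random random;

    OptimizeFrequencyPolicy(int optimizeFrequency, Random random) {
        if (optimizeFrequency < 0) {
            throw new IllegalArgumentException("optimize-frequency must be >= 0");
        }
        this.optimizeFrequency = optimizeFrequency;
        this.random = random;
    }

    /** Decides, once per index update, whether optimize() should run. */
    boolean needToOptimize() {
        if (optimizeFrequency == 0) {
            return false; // 0 disables optimization entirely
        }
        if (optimizeFrequency == 1) {
            return true;  // default: optimize on every update, as before
        }
        // With frequency N, each update has a 1-in-N chance of optimizing,
        // so no update counter has to be stored between calls.
        return random.nextInt(optimizeFrequency) == 0;
    }
}
```

A caller would ask needToOptimize() after each update and invoke the index writer's optimize() only when it returns true. Because the check is random, a setting of 50 means one optimize per 50 updates on average rather than exactly every 50th update; a counter would be deterministic but would have to keep state across updates.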



Re: [jira] Reopened: (COCOON-2065) huge performance increase of LuceneIndexTransformer on large Lucene indexes

Posted by Felix Knecht <fe...@apache.org>.
> It seems to be bad luck that it always hits you ;-)

I'm used to this; somehow I attract these kinds of problems. So please
be patient with me, I'm sure to fall into the next trap pretty soon ;-)

Felix

Re: [jira] Reopened: (COCOON-2065) huge performance increase of LuceneIndexTransformer on large Lucene indexes

Posted by Joerg Heinicke <jo...@gmx.de>.
On 24.07.2007 00:42, Felix Knecht wrote:

> I don't feel bad; the main question was whether I can trust the 'fixed'
> tags in Jira (2.1.11-dev was marked as fixed) or whether I need to check
> the code to confirm that all mentioned branches (even those already
> marked as fixed) are really fixed before closing.

Don't worry too much about such things. It just happens, and others
should recheck as well, as Antonio did now. There is no exact
step-by-step procedure for what to check.

What was suspicious in this case, for example, was that you were the first
assignee although the fix version was already set (see the "change history"
tab in the Jira issue). Also, the "Subversion commits" tab (still) shows
nothing, even though we usually put the issue number (the exact
one, in this case COCOON-2065) into the commit message. Jira checks
those messages (I don't know exactly how it is integrated) and puts them
on that tab.

So the fix version seems to have been set since the creation of the
issue; the creator probably thought this was appropriate since he came up
with a patch. Unfortunately, and obviously, that's wrong, and everybody
could have noticed it from the beginning and corrected it
in the issue. Just do the sanity checks you think are meaningful. It
seems to be bad luck that it always hits you ;-)

Regards
Joerg

Re: [jira] Reopened: (COCOON-2065) huge performance increase of LuceneIndexTransformer on large Lucene indexes

Posted by Felix Knecht <fe...@otego.com>.
Antonio Gallardo wrote:
> Hi Felix,
> 
> You should not feel bad about that. Currently we have only two branches:
> 2.2-dev and 2.1.11-dev. The patch was applied to 2.2-dev but not to
> 2.1.11-dev, which is why I reopened the issue.

I don't feel bad; the main question was whether I can trust the 'fixed'
tags in Jira (2.1.11-dev was marked as fixed) or whether I need to check
the code to confirm that all mentioned branches (even those already
marked as fixed) are really fixed before closing.

Regards
Felix

Re: [jira] Reopened: (COCOON-2065) huge performance increase of LuceneIndexTransformer on large Lucene indexes

Posted by Antonio Gallardo <ag...@agssa.net>.
Hi Felix,

You should not feel bad about that. Currently we have only two branches:
2.2-dev and 2.1.11-dev. The patch was applied to 2.2-dev but not to
2.1.11-dev, which is why I reopened the issue.

Best Regards,

Antonio Gallardo.


Felix Knecht wrote:
> Antonio Gallardo (JIRA) wrote:
>   
>>      [ https://issues.apache.org/jira/browse/COCOON-2065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
>>
>> Antonio Gallardo reopened COCOON-2065:
>> --------------------------------------
>>
>>
>> The patch was not applied in Cocoon 2.1.11-dev.
>>
>>     
>
> I really feel sorry about this, but I trusted the tag saying that it was
> already fixed in 2.1.11-dev.
> Am I forced to verify that all versions marked as fixed are really fixed
> before closing the issue?
>
> I relied on the versions marked as fixed (which was the case for 2.1.11-dev).
>
> Felix
>   


Re: [jira] Reopened: (COCOON-2065) huge performance increase of LuceneIndexTransformer on large Lucene indexes

Posted by Felix Knecht <fe...@otego.com>.
Antonio Gallardo (JIRA) wrote:
>      [ https://issues.apache.org/jira/browse/COCOON-2065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
> 
> Antonio Gallardo reopened COCOON-2065:
> --------------------------------------
> 
> 
> The patch was not applied in Cocoon 2.1.11-dev.
> 

I really feel sorry about this, but I trusted the tag saying that it was
already fixed in 2.1.11-dev.
Am I forced to verify that all versions marked as fixed are really fixed
before closing the issue?

I relied on the versions marked as fixed (which was the case for 2.1.11-dev).

Felix