Posted to dev@lucene.apache.org by "Shai Erera (JIRA)" <ji...@apache.org> on 2010/12/02 13:32:10 UTC

[jira] Created: (LUCENE-2790) IndexWriter should call MP.useCompoundFile and not LogMP.getUseCompoundFile

IndexWriter should call MP.useCompoundFile and not LogMP.getUseCompoundFile
---------------------------------------------------------------------------

                 Key: LUCENE-2790
                 URL: https://issues.apache.org/jira/browse/LUCENE-2790
             Project: Lucene - Java
          Issue Type: Improvement
          Components: Index
            Reporter: Shai Erera
            Assignee: Shai Erera
            Priority: Minor
             Fix For: 3.1, 4.0


Spin off from here: http://www.gossamer-threads.com/lists/lucene/java-dev/112311.

I will attach a patch shortly that addresses the issue on trunk.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Commented: (LUCENE-2790) IndexWriter should call MP.useCompoundFile and not LogMP.getUseCompoundFile

Posted by "Earwin Burrfoot (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12966550#action_12966550 ] 

Earwin Burrfoot commented on LUCENE-2790:
-----------------------------------------

Ok, let's commit?

There's no need to force the first few commits to CFS. CFS's sole purpose is to keep the number of simultaneously open files low. You're not likely to see frightening numbers with only a pair of segments in the index.
Later these segments are merged (and probably CFSed), so no worries.
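
A rough back-of-the-envelope sketch of that argument, in Java. It assumes every non-compound segment keeps a fixed number of files open and every compound segment keeps a single .cfs file open; the class name and the figure of 8 files per segment are illustrative assumptions, not numbers from this issue.

```java
// Hypothetical estimate of open files; not part of Lucene.
final class OpenFileEstimate {
    // Assumption: a plain segment holds `filesPerSegment` files open,
    // a compound (CFS) segment holds exactly one.
    static int openFiles(int plainSegments, int compoundSegments, int filesPerSegment) {
        return plainSegments * filesPerSegment + compoundSegments;
    }

    public static void main(String[] args) {
        int filesPerSegment = 8; // illustrative value only

        // A pair of freshly flushed segments left as plain files.
        System.out.println(openFiles(2, 0, filesPerSegment));
        // Thirty segments: all plain versus all compound.
        System.out.println(openFiles(30, 0, filesPerSegment));
        System.out.println(openFiles(0, 30, filesPerSegment));
    }
}
```

Under these assumptions the count only becomes a concern once many segments accumulate, which is when merging (and CFS) kicks in anyway.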




[jira] Commented: (LUCENE-2790) IndexWriter should call MP.useCompoundFile and not LogMP.getUseCompoundFile

Posted by "Shai Erera (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12966110#action_12966110 ] 

Shai Erera commented on LUCENE-2790:
------------------------------------

test-core passed for me before I uploaded the patch. Can you please post here the 'ant test' command that reproduces it?

I checked who implements useCompoundFile and all I find is LogMP and NoMP, both don't iterate on the SegmentInfos. What MP did you test with?

Anyway, need to take a closer look at that. So if you can paste here the 'ant test' that reproduces it, it'd be great.




[jira] Updated: (LUCENE-2790) IndexWriter should call MP.useCompoundFile and not LogMP.getUseCompoundFile

Posted by "Earwin Burrfoot (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Earwin Burrfoot updated LUCENE-2790:
------------------------------------

    Attachment: LUCENE-2790.patch

Okay, this patch fixes the remaining threading issue in IW.mergeMiddle,
and three tests that were expecting CFS segments and weren't getting them
because flush now respects noCFSRatio, whose default is 0.1.
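
To see why a flush that respects noCFSRatio can leave such tests without CFS segments, here is a minimal Java sketch of a ratio check. It assumes the policy packs a new segment into a compound file only when the segment is no larger than noCFSRatio times the total size of the segments already in the index, and that a ratio of 1.0 means "always use CFS". The class and method names are hypothetical, not Lucene's API.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a merge policy's compound-file decision.
final class CfsRatioSketch {
    // Default ratio quoted in this thread.
    static final double DEFAULT_NO_CFS_RATIO = 0.1;

    // Assumption: use CFS only if the new segment is at most
    // noCFSRatio * (total size of the segments already in the index).
    static boolean useCompoundFile(List<Long> existingSegmentSizes,
                                   long newSegmentSize,
                                   double noCFSRatio) {
        if (noCFSRatio >= 1.0) {
            return true; // assumed meaning of 1.0: always use CFS
        }
        long totalSize = 0;
        for (long size : existingSegmentSizes) {
            totalSize += size;
        }
        return newSegmentSize <= noCFSRatio * totalSize;
    }

    public static void main(String[] args) {
        // First flush into an empty index: the total is 0, so no CFS.
        System.out.println(useCompoundFile(
                Collections.<Long>emptyList(), 100L, DEFAULT_NO_CFS_RATIO));
        // Small segment next to a much larger index: CFS is used.
        System.out.println(useCompoundFile(
                Arrays.asList(5_000L, 5_000L), 100L, DEFAULT_NO_CFS_RATIO));
    }
}
```

Under this assumption, a test that flushes one small segment into a fresh index and then expects a .cfs file would need to raise the ratio to 1.0 first.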




[jira] Commented: (LUCENE-2790) IndexWriter should call MP.useCompoundFile and not LogMP.getUseCompoundFile

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12966557#action_12966557 ] 

Michael McCandless commented on LUCENE-2790:
--------------------------------------------

bq. Ok, let's commit?

+1




[jira] Updated: (LUCENE-2790) IndexWriter should call MP.useCompoundFile and not LogMP.getUseCompoundFile

Posted by "Shai Erera (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shai Erera updated LUCENE-2790:
-------------------------------

    Attachment: LUCENE-2790.patch

Patch fixes the threading issue Earwin reported, by checking whether to create the CFS in a sync block. Also, after discussing this on IRC, the code is further simplified by creating the compound file before the new segment is committed.

However, some tests still fail with ConcurrentModificationException. I cannot debug it now, so I am posting the patch in case someone wants to take a stab. I can continue later. To reproduce the failure:

ant test -Dtestcase=TestIndexWriter -Dtestmethod=testDeleteUnusedFiles -Dtests.seed=-1861905402886420424:-8896948763797565454 -Dtests.codec=randomPerField






[jira] Issue Comment Edited: (LUCENE-2790) IndexWriter should call MP.useCompoundFile and not LogMP.getUseCompoundFile

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12967183#action_12967183 ] 

Uwe Schindler edited comment on LUCENE-2790 at 12/6/10 10:54 AM:
-----------------------------------------------------------------

I would simply disable the tests. Reflection should only be used when mock classes are used that affect thousands of tests. There are already lots of tests disabled.




[jira] Commented: (LUCENE-2790) IndexWriter should call MP.useCompoundFile and not LogMP.getUseCompoundFile

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12966167#action_12966167 ] 

Michael McCandless commented on LUCENE-2790:
--------------------------------------------

Patch looks great!

My only concern is... it looks like addIndexes(IR[]), with compound file used in the end, may fail to delete the non-compound files once the SegmentInfo is committed?  Maybe we should add a test to show the failure...

I think we need to do something like this:
{noformat}
          // delete new non cfs files directly: they were never
          // registered with IFD
          deleter.deleteNewFiles(merger.getMergedFiles(merge.info));
{noformat}




[jira] Issue Comment Edited: (LUCENE-2790) IndexWriter should call MP.useCompoundFile and not LogMP.getUseCompoundFile

Posted by "Earwin Burrfoot (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12966108#action_12966108 ] 

Earwin Burrfoot edited comment on LUCENE-2790 at 12/2/10 8:12 AM:
------------------------------------------------------------------

Check this patch out.
It changes useCompoundFile(SIS, SI) to respect noCFSRatio and drops useCompoundFile from OneMerge, so all decisions about using compound files now happen in a single place.
It also highlights the problem with your patch - when calling useCompoundFile from addIndexes, you should hold a lock, so segmentInfos won't be modified while mergePolicy inspects them.
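
The locking point can be sketched in Java as follows. This is a minimal illustration, assuming the segment list is a plain ArrayList guarded by a single monitor; GuardedSegmentList and its methods are hypothetical names, not IndexWriter's code. Without the shared lock, the iteration in totalSize() could throw ConcurrentModificationException while another thread adds a segment.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the writer's segment list; not Lucene's SegmentInfos.
final class GuardedSegmentList {
    private final List<Long> segmentSizes = new ArrayList<>();

    // Writers take the same monitor as readers...
    synchronized void addSegment(long size) {
        segmentSizes.add(size);
    }

    // ...so this iteration never runs while the list is being modified.
    synchronized long totalSize() {
        long total = 0;
        for (long size : segmentSizes) {
            total += size;
        }
        return total;
    }

    // Adds `additions` segments of size 1 on a second thread while the calling
    // thread keeps inspecting the list; returns the final total.
    static long runConcurrentDemo(int additions) {
        GuardedSegmentList segments = new GuardedSegmentList();
        Thread adder = new Thread(() -> {
            for (int i = 0; i < additions; i++) {
                segments.addSegment(1L);
            }
        });
        adder.start();
        while (adder.isAlive()) {
            segments.totalSize(); // safe: holds the same lock as addSegment
        }
        try {
            adder.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return segments.totalSize();
    }

    public static void main(String[] args) {
        System.out.println(runConcurrentDemo(10_000));
    }
}
```

The same idea applies to a policy callback: whoever hands the list to the policy should hold the lock that guards modifications for the duration of the call.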




[jira] Commented: (LUCENE-2790) IndexWriter should call MP.useCompoundFile and not LogMP.getUseCompoundFile

Posted by "Shai Erera (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12967127#action_12967127 ] 

Shai Erera commented on LUCENE-2790:
------------------------------------

bq. How Lucene manages the index files is under-the-hood so we are free to change it.

That's correct. However, sadly, the backwards tests do not agree with you :). Because the runtime behavior has changed, the tests fail. If you try to call LMP.setNoCFSRatio, you get a NoSuchMethodError because the tests are compiled against 3.0's source, where indeed it does not exist.

I'm trying to resolve it by fetching the method using reflection, but this shows another problem w/ how we maintain the backwards tests.
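
A minimal Java sketch of that kind of reflective lookup, for illustration only. NewPolicy and OldPolicy are hypothetical stand-ins for a class that has setNoCFSRatio(double) and one that does not; this is not the code from the patch.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

final class OptionalSetterSketch {
    // Hypothetical policy from the newer code line: has the setter.
    public static final class NewPolicy {
        private double noCFSRatio = 0.1;
        public void setNoCFSRatio(double ratio) { this.noCFSRatio = ratio; }
        public double getNoCFSRatio() { return noCFSRatio; }
    }

    // Hypothetical policy from the older code line: no such setter.
    public static final class OldPolicy { }

    // Calls target.setNoCFSRatio(ratio) if the method exists.
    // Returns true if the call was made, false if the method is absent.
    static boolean trySetNoCFSRatio(Object target, double ratio) {
        try {
            Method setter = target.getClass().getMethod("setNoCFSRatio", double.class);
            setter.invoke(target, ratio);
            return true;
        } catch (NoSuchMethodException e) {
            return false; // older class: nothing to configure
        } catch (IllegalAccessException | InvocationTargetException e) {
            throw new IllegalStateException("setNoCFSRatio exists but could not be called", e);
        }
    }

    public static void main(String[] args) {
        NewPolicy newer = new NewPolicy();
        System.out.println(trySetNoCFSRatio(newer, 1.0));
        System.out.println(newer.getNoCFSRatio());
        System.out.println(trySetNoCFSRatio(new OldPolicy(), 1.0));
    }
}
```

Because the method is resolved by name at run time, the test source compiles against either version, at the cost of losing compile-time checking of the call.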




[jira] Updated: (LUCENE-2790) IndexWriter should call MP.useCompoundFile and not LogMP.getUseCompoundFile

Posted by "Earwin Burrfoot (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Earwin Burrfoot updated LUCENE-2790:
------------------------------------

    Attachment: LUCENE-2790.patch

Check this patch out.
It moves noCFS ratio to useCompoundFile(SIS, SI) and drops useCompoundFile from OneMerge, so all decisions about using compound files now happen in a single place.
It also highlights the problem with your patch - when calling useCompoundFile from addIndexes, you should hold a lock, so segmentInfos won't be modified while mergePolicy inspects them.




[jira] Commented: (LUCENE-2790) IndexWriter should call MP.useCompoundFile and not LogMP.getUseCompoundFile

Posted by "Earwin Burrfoot (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12966285#action_12966285 ] 

Earwin Burrfoot commented on LUCENE-2790:
-----------------------------------------

Shai, what about:
bq. My only concern is... it looks like addIndexes(IR[]), with compound file used in the end, may fail to delete the non-compound files once the SegmentInfo is committed?
I fixed everything else, but can't answer this question.




[jira] Commented: (LUCENE-2790) IndexWriter should call MP.useCompoundFile and not LogMP.getUseCompoundFile

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12966815#action_12966815 ] 

Michael McCandless commented on LUCENE-2790:
--------------------------------------------

I think we should document the change to LMP.useCompoundFile?

But: I don't consider this a backwards break.

How Lucene manages the index files is under-the-hood so we are free to change it.




[jira] Commented: (LUCENE-2790) IndexWriter should call MP.useCompoundFile and not LogMP.getUseCompoundFile

Posted by "Earwin Burrfoot (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12966103#action_12966103 ] 

Earwin Burrfoot commented on LUCENE-2790:
-----------------------------------------

Fails addIndexesWithThreads with ConcurrentModificationException, if MergePolicy actually tries to iterate infos passed to useCompoundFile(SIS, SI).




[jira] Resolved: (LUCENE-2790) IndexWriter should call MP.useCompoundFile and not LogMP.getUseCompoundFile

Posted by "Shai Erera (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shai Erera resolved LUCENE-2790.
--------------------------------

    Resolution: Fixed

Committed revision 1042948 (3x)

I decided to keep the reflection hack for now, until we come up w/ a better decision. One of the tests which had to be fixed is TestBackwardsCompatibility which needs to be in backwards and I don't think we can delete it, even if it's tested by 'core' as well.

Thanks Earwin and others for your comments and help!




[jira] Updated: (LUCENE-2790) IndexWriter should call MP.useCompoundFile and not LogMP.getUseCompoundFile

Posted by "Earwin Burrfoot (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Earwin Burrfoot updated LUCENE-2790:
------------------------------------

    Attachment: LUCENE-2790.patch

Fixed your test failure




[jira] Commented: (LUCENE-2790) IndexWriter should call MP.useCompoundFile and not LogMP.getUseCompoundFile

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12966178#action_12966178 ] 

Michael McCandless commented on LUCENE-2790:
--------------------------------------------

Hmm... something is amiss.  I hit this failure:
{noformat}
ant test -Dtestcase=TestIndexSplitter -Dtestmethod=test -Dtests.seed=5299033587626573117:-25334708766924714 -Dtests.codec=randomPerField
{noformat}

But it passes on trunk...




[jira] Updated: (LUCENE-2790) IndexWriter should call MP.useCompoundFile and not LogMP.getUseCompoundFile

Posted by "Shai Erera (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shai Erera updated LUCENE-2790:
-------------------------------

    Attachment: LUCENE-2790-3x.patch

Backport to 3x. Note the reflection hack I had to use to make the backwards tests run. I haven't committed yet; I'm waiting for some response about the backwards tests. If you're ok with it, I'll commit.




[jira] Commented: (LUCENE-2790) IndexWriter should call MP.useCompoundFile and not LogMP.getUseCompoundFile

Posted by "Earwin Burrfoot (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12966112#action_12966112 ] 

Earwin Burrfoot commented on LUCENE-2790:
-----------------------------------------

bq. I checked who implements useCompoundFile and all I find is LogMP and NoMP, both don't iterate on the SegmentInfos. What MP did you test with?
Apply my patch, it changes LogMP to use SegmentInfos.

bq. So if you can paste here the 'ant test' that reproduces it, it'd be great.
ant test -Dtestcase=TestAddIndexes -Dtestmethod=testAddIndexesWithThreads -Dtests.seed=5369960668186287821:331425426639083833 -Dtests.codec=randomPerField
The test is threaded, so it doesn't always fail.

> IndexWriter should call MP.useCompoundFile and not LogMP.getUseCompoundFile
> ---------------------------------------------------------------------------
>
>                 Key: LUCENE-2790
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2790
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>            Priority: Minor
>             Fix For: 3.1, 4.0
>
>         Attachments: LUCENE-2790.patch, LUCENE-2790.patch
>
>
> Spin off from here: http://www.gossamer-threads.com/lists/lucene/java-dev/112311.
> I will attach a patch shortly that addresses the issue on trunk.



[jira] Commented: (LUCENE-2790) IndexWriter should call MP.useCompoundFile and not LogMP.getUseCompoundFile

Posted by "Shai Erera (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12966248#action_12966248 ] 

Shai Erera commented on LUCENE-2790:
------------------------------------

Patch looks good. All tests pass for me. Let's give it a couple more tries, to allow for random tests to catch us. It'd be good if you can try running them too.

> IndexWriter should call MP.useCompoundFile and not LogMP.getUseCompoundFile
> ---------------------------------------------------------------------------
>
>                 Key: LUCENE-2790
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2790
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>            Priority: Minor
>             Fix For: 3.1, 4.0
>
>         Attachments: LUCENE-2790.patch, LUCENE-2790.patch, LUCENE-2790.patch, LUCENE-2790.patch, LUCENE-2790.patch
>
>
> Spin off from here: http://www.gossamer-threads.com/lists/lucene/java-dev/112311.
> I will attach a patch shortly that addresses the issue on trunk.



[jira] Commented: (LUCENE-2790) IndexWriter should call MP.useCompoundFile and not LogMP.getUseCompoundFile

Posted by "Shai Erera (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12967208#action_12967208 ] 

Shai Erera commented on LUCENE-2790:
------------------------------------

I don't mind disabling the tests, but I think we should discuss the bigger issue (on that thread on the mailing list). If we decide to make it a 'policy' to disable backwards tests that break due to legal changes to the API and behavior, let's at least reach a consensus.

> IndexWriter should call MP.useCompoundFile and not LogMP.getUseCompoundFile
> ---------------------------------------------------------------------------
>
>                 Key: LUCENE-2790
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2790
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>            Priority: Minor
>             Fix For: 3.1, 4.0
>
>         Attachments: LUCENE-2790-3x.patch, LUCENE-2790.patch, LUCENE-2790.patch, LUCENE-2790.patch, LUCENE-2790.patch, LUCENE-2790.patch, LUCENE-2790.patch, LUCENE-2790.patch
>
>
> Spin off from here: http://www.gossamer-threads.com/lists/lucene/java-dev/112311.
> I will attach a patch shortly that addresses the issue on trunk.



[jira] Updated: (LUCENE-2790) IndexWriter should call MP.useCompoundFile and not LogMP.getUseCompoundFile

Posted by "Shai Erera (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shai Erera updated LUCENE-2790:
-------------------------------

    Attachment: LUCENE-2790.patch

Same patch, only uses MockAnalyzer and not WhitespaceAnalyzer (which failed compilation from command line).

> IndexWriter should call MP.useCompoundFile and not LogMP.getUseCompoundFile
> ---------------------------------------------------------------------------
>
>                 Key: LUCENE-2790
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2790
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>            Priority: Minor
>             Fix For: 3.1, 4.0
>
>         Attachments: LUCENE-2790.patch, LUCENE-2790.patch, LUCENE-2790.patch, LUCENE-2790.patch, LUCENE-2790.patch, LUCENE-2790.patch, LUCENE-2790.patch
>
>
> Spin off from here: http://www.gossamer-threads.com/lists/lucene/java-dev/112311.
> I will attach a patch shortly that addresses the issue on trunk.



[jira] Commented: (LUCENE-2790) IndexWriter should call MP.useCompoundFile and not LogMP.getUseCompoundFile

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12967183#action_12967183 ] 

Uwe Schindler commented on LUCENE-2790:
---------------------------------------

I would simply disable the tests. Reflection should only be used when mock classes are used that affect thousands of tests. There are already lots of tests disabled.

> IndexWriter should call MP.useCompoundFile and not LogMP.getUseCompoundFile
> ---------------------------------------------------------------------------
>
>                 Key: LUCENE-2790
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2790
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>            Priority: Minor
>             Fix For: 3.1, 4.0
>
>         Attachments: LUCENE-2790-3x.patch, LUCENE-2790.patch, LUCENE-2790.patch, LUCENE-2790.patch, LUCENE-2790.patch, LUCENE-2790.patch, LUCENE-2790.patch, LUCENE-2790.patch
>
>
> Spin off from here: http://www.gossamer-threads.com/lists/lucene/java-dev/112311.
> I will attach a patch shortly that addresses the issue on trunk.



[jira] Updated: (LUCENE-2790) IndexWriter should call MP.useCompoundFile and not LogMP.getUseCompoundFile

Posted by "Shai Erera (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shai Erera updated LUCENE-2790:
-------------------------------

    Attachment: LUCENE-2790.patch

Patch applied on trunk. I took the opportunity to fix some minor Javadoc warnings as well.

> IndexWriter should call MP.useCompoundFile and not LogMP.getUseCompoundFile
> ---------------------------------------------------------------------------
>
>                 Key: LUCENE-2790
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2790
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>            Priority: Minor
>             Fix For: 3.1, 4.0
>
>         Attachments: LUCENE-2790.patch
>
>
> Spin off from here: http://www.gossamer-threads.com/lists/lucene/java-dev/112311.
> I will attach a patch shortly that addresses the issue on trunk.



[jira] Commented: (LUCENE-2790) IndexWriter should call MP.useCompoundFile and not LogMP.getUseCompoundFile

Posted by "Shai Erera (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12966788#action_12966788 ] 

Shai Erera commented on LUCENE-2790:
------------------------------------

Committed revision 1042101 to trunk.

I will back port to 3x if you agree this isn't a backwards break.

BTW, I did not add a CHANGES entry, because it's an internal optimization we've made to IndexWriter. Hmm .. maybe we should document the changes to LMP.useCompoundFile (that it now factors in the noCFSRatio)?

> IndexWriter should call MP.useCompoundFile and not LogMP.getUseCompoundFile
> ---------------------------------------------------------------------------
>
>                 Key: LUCENE-2790
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2790
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>            Priority: Minor
>             Fix For: 3.1, 4.0
>
>         Attachments: LUCENE-2790.patch, LUCENE-2790.patch, LUCENE-2790.patch, LUCENE-2790.patch, LUCENE-2790.patch, LUCENE-2790.patch, LUCENE-2790.patch
>
>
> Spin off from here: http://www.gossamer-threads.com/lists/lucene/java-dev/112311.
> I will attach a patch shortly that addresses the issue on trunk.



[jira] Updated: (LUCENE-2790) IndexWriter should call MP.useCompoundFile and not LogMP.getUseCompoundFile

Posted by "Shai Erera (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shai Erera updated LUCENE-2790:
-------------------------------

    Attachment: LUCENE-2790.patch

The attached patch adds a test to TestAddIndexes w/ the fix as Mike proposed. The test fails w/o the fix and passes w/ it.

Also, I noticed that if I don't set noCFSRatio to 1.0, then the added segments are not converted to a CFS. That is because useCompoundFile on LMP decides not to do that, because the size of the segment, which is 377 bytes, is more than 10% of the total index size, which is ... 0. I wonder if we should handle that case, or leave it as is - at some point, when more documents are added, that segment will be converted to a CFS.

I think that means that the first few segments that will be flushed will remain in non CFS format. I'm fine w/ it, just making sure I understand this right.
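
A minimal sketch of the size check described above, for illustration only. It is not the LogMergePolicy code: the class name NoCfsRatioSketch, the method signature, and the parameter names are hypothetical, and the short-circuit for a ratio of 1.0 is an assumption made so that the sketch matches the behavior reported in this comment (with noCFSRatio set to 1.0 the added segments are converted even though the index is empty).

```java
// Illustrative sketch only; names and signature are hypothetical,
// not taken from Lucene's LogMergePolicy.
public final class NoCfsRatioSketch {

    private NoCfsRatioSketch() {}

    /**
     * Decides whether a segment should be written as a compound file (CFS).
     *
     * @param cfsEnabled      whether compound files are enabled at all
     * @param noCFSRatio      fraction of the total index size above which
     *                        a segment is left in non-CFS format
     * @param segmentBytes    size of the segment being considered
     * @param totalIndexBytes size of the index the segment is compared against
     */
    public static boolean useCompoundFile(boolean cfsEnabled, double noCFSRatio,
                                          long segmentBytes, long totalIndexBytes) {
        if (!cfsEnabled) {
            return false;
        }
        // Assumption: a ratio of 1.0 means "always use CFS", regardless of sizes.
        if (noCFSRatio >= 1.0) {
            return true;
        }
        // The segment becomes a CFS only if it is small relative to the index.
        return segmentBytes <= noCFSRatio * totalIndexBytes;
    }

    public static void main(String[] args) {
        // The case from this comment: a 377-byte segment against an empty index.
        System.out.println(useCompoundFile(true, 0.1, 377, 0));
        // Same segment with noCFSRatio set to 1.0.
        System.out.println(useCompoundFile(true, 1.0, 377, 0));
        // Same segment once the index has grown to 10,000 bytes.
        System.out.println(useCompoundFile(true, 0.1, 377, 10_000));
    }
}
```

Under these assumptions the first call rejects CFS because 377 is greater than 0.1 * 0, which is why the first few flushed segments stay in non-CFS format until the index grows.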

> IndexWriter should call MP.useCompoundFile and not LogMP.getUseCompoundFile
> ---------------------------------------------------------------------------
>
>                 Key: LUCENE-2790
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2790
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>            Priority: Minor
>             Fix For: 3.1, 4.0
>
>         Attachments: LUCENE-2790.patch, LUCENE-2790.patch, LUCENE-2790.patch, LUCENE-2790.patch, LUCENE-2790.patch, LUCENE-2790.patch
>
>
> Spin off from here: http://www.gossamer-threads.com/lists/lucene/java-dev/112311.
> I will attach a patch shortly that addresses the issue on trunk.



[jira] Commented: (LUCENE-2790) IndexWriter should call MP.useCompoundFile and not LogMP.getUseCompoundFile

Posted by "Shai Erera (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12966787#action_12966787 ] 

Shai Erera commented on LUCENE-2790:
------------------------------------

Do you see any back-compat issues w/ back-porting it to 3x? I'm thinking about the change in behavior of useCompoundFile in LMP, which now factors in noCFSRatio. However, I see that noCFSRatio is in 3x's LMP and defaults to 0.1, which already changes behavior, so I think we can apply this change to 3x as well. What do you think?

> IndexWriter should call MP.useCompoundFile and not LogMP.getUseCompoundFile
> ---------------------------------------------------------------------------
>
>                 Key: LUCENE-2790
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2790
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>            Priority: Minor
>             Fix For: 3.1, 4.0
>
>         Attachments: LUCENE-2790.patch, LUCENE-2790.patch, LUCENE-2790.patch, LUCENE-2790.patch, LUCENE-2790.patch, LUCENE-2790.patch, LUCENE-2790.patch
>
>
> Spin off from here: http://www.gossamer-threads.com/lists/lucene/java-dev/112311.
> I will attach a patch shortly that addresses the issue on trunk.
