Posted to dev@lucene.apache.org by "Uwe Schindler (Created) (JIRA)" <ji...@apache.org> on 2011/11/23 01:32:39 UTC

[jira] [Created] (LUCENE-3588) Try harder to prevent SIGSEGV on cloned MMapIndexInputs

Try harder to prevent SIGSEGV on cloned MMapIndexInputs
-------------------------------------------------------

                 Key: LUCENE-3588
                 URL: https://issues.apache.org/jira/browse/LUCENE-3588
             Project: Lucene - Java
          Issue Type: Improvement
          Components: core/store
    Affects Versions: 3.4, 3.5
            Reporter: Uwe Schindler
            Assignee: Uwe Schindler
             Fix For: 3.6, 4.0


We unmap mmapped byte buffers, which the JDK does not allow, because accessing a mapped byte buffer after unmapping risks a SIGSEGV.

We currently prevent this for the main IndexInput by setting its buffer to null, so we get an NPE if somebody tries to access the underlying buffer. I also recently fixed the stupid curBuf (LUCENE-3200) by setting it to null.

The big problem is cloned IndexInputs, which are generally not closed. Those still contain references to the unmapped ByteBuffer, which easily leads to SIGSEGV. The patch from Mike in LUCENE-3439 prevents most of this in Lucene 3.5, but it's still not 100% safe (as it uses non-volatiles).

This patch will fix the remaining issues by also setting the buffers of clones to null when the original is closed. The trick is to record weak references to all clones created and close them together with the original. This uses a ConcurrentHashMap<WeakReference<MMapIndexInput>,?> as the store, with logic borrowed from WeakHashMap to clean up the GCed references (using a ReferenceQueue).

If we respin 3.5, we should maybe also get this in.
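
As a rough illustration of the idea in the description above (this is not the code from the attached patch), a registry of weakly referenced clones might look like the following sketch. TrackedInput and CloneRegistry are hypothetical names, and a byte[] stands in for the mapped ByteBuffer.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical stand-in for an input whose buffer must be unset on close.
final class TrackedInput {
  volatile byte[] buffer = new byte[16]; // assumption: stands in for the mapped ByteBuffer
}

// Hypothetical registry of clones, keyed by weak references.
final class CloneRegistry {
  private final ReferenceQueue<TrackedInput> queue = new ReferenceQueue<>();
  private final ConcurrentMap<WeakReference<TrackedInput>, Boolean> clones =
      new ConcurrentHashMap<>();

  // Record a clone without keeping it strongly reachable.
  void register(TrackedInput clone) {
    expungeStale();
    clones.put(new WeakReference<>(clone, queue), Boolean.TRUE);
  }

  // Unset the buffer of every clone that is still alive, then forget them all.
  void closeAll() {
    for (WeakReference<TrackedInput> ref : clones.keySet()) {
      TrackedInput input = ref.get();
      if (input != null) {
        input.buffer = null;
      }
    }
    clones.clear();
  }

  // Drop map entries whose referents were already garbage collected.
  private void expungeStale() {
    Reference<? extends TrackedInput> stale;
    while ((stale = queue.poll()) != null) {
      clones.remove(stale);
    }
  }

  int size() {
    expungeStale();
    return clones.size();
  }
}
```

In this sketch register() would be called whenever a clone is created and closeAll() from the original's close(). Polling the ReferenceQueue to drop stale entries is similar in spirit to what WeakHashMap does internally.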

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Resolved] (LUCENE-3588) Try harder to prevent SIGSEGV on cloned MMapIndexInputs

Posted by "Uwe Schindler (Resolved) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-3588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler resolved LUCENE-3588.
-----------------------------------

       Resolution: Fixed
    Lucene Fields: New,Patch Available  (was: New)

Committed trunk revision: 1205430
Committed 3.x revision: 1205434
                



[jira] [Updated] (LUCENE-3588) Try harder to prevent SIGSEGV on cloned MMapIndexInputs

Posted by "Uwe Schindler (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-3588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated LUCENE-3588:
----------------------------------

    Attachment: LUCENE-3588.patch
    



[jira] [Commented] (LUCENE-3588) Try harder to prevent SIGSEGV on cloned MMapIndexInputs

Posted by "Dawid Weiss (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13155822#comment-13155822 ] 

Dawid Weiss commented on LUCENE-3588:
-------------------------------------

I was thinking about this when looking at the code, and I thought the intention of using CHM was to get an iterator that won't throw CME while iterating. If this isn't possible then you're right -- it's the same thing to use a decorated WhateverMap.
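
For readers unfamiliar with the CME point, here is a small single-threaded sketch of the difference between the two iterator types. IterationDemo and its method are hypothetical names used only for this illustration.

```java
import java.util.ConcurrentModificationException;
import java.util.Map;

// Hypothetical demo class, not part of Lucene.
final class IterationDemo {
  // Adds one new key while iterating over the key set.
  // Returns true if the iteration finished without a ConcurrentModificationException.
  static boolean mutateWhileIterating(Map<Integer, String> map) {
    map.put(1, "a");
    map.put(2, "b");
    try {
      for (Integer key : map.keySet()) {
        // Structural change on the first pass; later passes only overwrite the value.
        map.put(999, "added while visiting " + key);
      }
      return true;
    } catch (ConcurrentModificationException e) {
      return false;
    }
  }
}
```

Passing a java.util.HashMap makes the fail-fast iterator throw, so the method returns false. Passing a java.util.concurrent.ConcurrentHashMap returns true, because its iterators are weakly consistent and never throw CME.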
                



[jira] [Commented] (LUCENE-3588) Try harder to prevent SIGSEGV on cloned MMapIndexInputs

Posted by "Doron Cohen (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13155897#comment-13155897 ] 

Doron Cohen commented on LUCENE-3588:
-------------------------------------

Patch (last one) works well for me - the new test fails without the fix and passes with the fix.

It relies on shallow cloning of 'clones' - and so would break if WHM starts to implement Cloneable for some reason, but then the 'assert clone.clones == this.clones' in clone() guarantees early detection of this in the tests, cool.
                



[jira] [Commented] (LUCENE-3588) Try harder to prevent SIGSEGV on cloned MMapIndexInputs

Posted by "Uwe Schindler (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13155826#comment-13155826 ] 

Uwe Schindler commented on LUCENE-3588:
---------------------------------------

WeakHashMap silently discards GCed references during iteration. And the close() method is synchronized on the map, too.
                



[jira] [Updated] (LUCENE-3588) Try harder to prevent SIGSEGV on cloned MMapIndexInputs

Posted by "Uwe Schindler (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-3588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated LUCENE-3588:
----------------------------------

    Attachment: LUCENE-3588-simpler.patch
                LUCENE-3588.patch

New patch that no longer throws NPE: all NPEs are converted to AlreadyClosedExceptions in MMapIndexInput. This does not add overhead, as the try/catch blocks are already there.

LUCENE-3588.patch is now the authoritative patch file.
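
The NPE-to-AlreadyClosedException conversion mentioned above could be sketched roughly as follows. This is a single-threaded illustration under assumptions, not the patch itself: NullingInput is a hypothetical class, and the AlreadyClosedException defined here is a local stand-in for Lucene's exception of the same name so that the example is self-contained.

```java
import java.io.IOException;
import java.nio.BufferUnderflowException;
import java.nio.ByteBuffer;

// Local stand-in (assumption) for Lucene's AlreadyClosedException.
final class AlreadyClosedException extends IllegalStateException {
  AlreadyClosedException(String message) {
    super(message);
  }
}

// Hypothetical input that unsets its buffer on close.
final class NullingInput {
  private ByteBuffer buffer; // set to null by close()

  NullingInput(byte[] data) {
    this.buffer = ByteBuffer.wrap(data);
  }

  byte readByte() throws IOException {
    try {
      return buffer.get();
    } catch (BufferUnderflowException e) {
      // A try/catch like this is needed anyway to report reads past the end.
      throw new IOException("read past EOF");
    } catch (NullPointerException e) {
      // buffer was unset by close(): report a meaningful exception instead of an NPE.
      throw new AlreadyClosedException("input is already closed");
    }
  }

  void close() {
    buffer = null;
  }
}
```

The extra catch clause sits on the exceptional path only, so the normal read path is unchanged.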
                



[jira] [Commented] (LUCENE-3588) Try harder to prevent SIGSEGV on cloned MMapIndexInputs

Posted by "Uwe Schindler (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13156863#comment-13156863 ] 

Uwe Schindler commented on LUCENE-3588:
---------------------------------------

I missed one more possible NPE -> AlreadyClosedException transformation in getFilePointer. Committed revs 1205954 (trunk), 1205956 (3x)
                



[jira] [Updated] (LUCENE-3588) Try harder to prevent SIGSEGV on cloned MMapIndexInputs

Posted by "Uwe Schindler (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-3588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated LUCENE-3588:
----------------------------------

    Attachment: LUCENE-3588-simpler.patch

Here is a much simpler patch than yesterday's (including Robert's test):
The complexity added by ConcurrentHashMap with WeakReference and ReferenceQueue is nonsense, as CHM is optimized for many clients getting entries from the map. In our use case, the only one who gets entries from the map is our close() method. When cloning, we only call put(), so it's always synchronized by CHM and there is no difference from a standard synchronized WhateverMap.
This patch uses the simple approach: use a plain WeakHashMap with synchronization around put() and the close() cleanup. This removes all Reference handling and simplifies the code a lot.

I think this is ready to commit.
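
A minimal sketch of the simpler approach described above, assuming a plain WeakHashMap guarded by a synchronized block. BufferHolder and WeakCloneTracker are hypothetical names, and a byte[] again stands in for the mapped buffer.

```java
import java.util.Map;
import java.util.WeakHashMap;

// Hypothetical stand-in for a cloned input.
final class BufferHolder {
  byte[] buffer = new byte[8]; // assumption: stands in for the mapped ByteBuffer
}

// Hypothetical tracker; WeakHashMap handles the weak references and stale entries itself.
final class WeakCloneTracker {
  private final Map<BufferHolder, Boolean> clones = new WeakHashMap<>();

  // Called when a clone is created.
  void register(BufferHolder clone) {
    synchronized (clones) {
      clones.put(clone, Boolean.TRUE);
    }
  }

  // Called when the original is closed: unset every clone that is still reachable.
  void closeAll() {
    synchronized (clones) {
      for (BufferHolder clone : clones.keySet()) {
        clone.buffer = null;
      }
      clones.clear();
    }
  }
}
```

Compared with a hand-rolled map of WeakReference keys, no ReferenceQueue code is needed here, because WeakHashMap removes entries for collected keys on its own.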
                



[jira] [Updated] (LUCENE-3588) Try harder to prevent SIGSEGV on cloned MMapIndexInputs

Posted by "Uwe Schindler (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-3588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated LUCENE-3588:
----------------------------------

    Attachment: LUCENE-3588-simpler.patch

Improved patch:
- All clones (even clones of clones) share the same WeakHashMap with the original. Only the original MMapIndexInput will unset the buffers in all clones/cloned-clones.
- This reduces the cost of creating clones (no HashMap instantiation, no ReferenceQueues, ...)

Added test with clone of clone.
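
The sharing described in this comment can be illustrated with the sketch below. It is a single-threaded approximation under assumptions, not the committed code: SharedCloneInput is a hypothetical class, a byte[] stands in for the mapped buffer, and the behavior of close() on a clone (unsetting only its own buffer) is an assumption. The assert in clone() mirrors the check quoted elsewhere in this thread.

```java
import java.util.Map;
import java.util.WeakHashMap;

// Hypothetical input whose clones all share the original's registry.
final class SharedCloneInput implements Cloneable {
  private byte[] buffer;            // assumption: stands in for the mapped ByteBuffer
  private boolean isClone = false;
  // Object.clone() copies this reference, so every clone sees the same map instance.
  private final Map<SharedCloneInput, Boolean> clones = new WeakHashMap<>();

  SharedCloneInput(byte[] data) {
    this.buffer = data;
  }

  @Override
  public SharedCloneInput clone() {
    final SharedCloneInput clone;
    try {
      clone = (SharedCloneInput) super.clone(); // shallow copy, no new map is created
    } catch (CloneNotSupportedException e) {
      throw new AssertionError(e);
    }
    clone.isClone = true;
    assert clone.clones == this.clones;
    synchronized (clones) {
      clones.put(clone, Boolean.TRUE);
    }
    return clone;
  }

  boolean isOpen() {
    return buffer != null;
  }

  boolean sharesRegistryWith(SharedCloneInput other) {
    return this.clones == other.clones;
  }

  void close() {
    if (!isClone) {
      // Only the original unsets the buffers of all clones and clones of clones.
      synchronized (clones) {
        for (SharedCloneInput clone : clones.keySet()) {
          clone.buffer = null;
        }
        clones.clear();
      }
    }
    buffer = null;
  }
}
```

Because a clone of a clone registers itself in the same shared map, closing the original reaches it without any per-clone map or reference queue.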
                



[jira] [Commented] (LUCENE-3588) Try harder to prevent SIGSEGV on cloned MMapIndexInputs

Posted by "Dawid Weiss (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13155744#comment-13155744 ] 

Dawid Weiss commented on LUCENE-3588:
-------------------------------------

Looks good to me. Interesting solution.
                



[jira] [Updated] (LUCENE-3588) Try harder to prevent SIGSEGV on cloned MMapIndexInputs

Posted by "Robert Muir (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-3588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir updated LUCENE-3588:
--------------------------------

    Attachment: LUCENE-3588.patch

+1, I added a simple test: it SIGSEGVs without the patch and passes with it.
                
