Posted to dev@lucene.apache.org by "Uwe Schindler (JIRA)" <ji...@apache.org> on 2012/06/22 10:31:42 UTC

[jira] [Created] (LUCENE-4163) Improve concurrency in MMapIndexInput.clone()

Uwe Schindler created LUCENE-4163:
-------------------------------------

             Summary: Improve concurrency in MMapIndexInput.clone()
                 Key: LUCENE-4163
                 URL: https://issues.apache.org/jira/browse/LUCENE-4163
             Project: Lucene - Java
          Issue Type: Improvement
          Components: core/store
    Affects Versions: 3.6, 4.0, 5.0
            Reporter: Uwe Schindler
            Assignee: Uwe Schindler
             Fix For: 4.0, 3.6.1, 5.0


Followup issue from SOLR-3566:

Whenever you clone the TermIndex, it also creates a clone of the underlying IndexInputs. In highly concurrent environments, the clone method of MMapIndexInput is a bottleneck (it has heavy work to do to manage the weak references in a synchronized block).

Everywhere else in Lucene we use my new WeakIdentityMap for managing concurrent weak maps. For this case I did not do this, as the WeakIdentityMap has no iterators (it does not implement the Map interface). This issue will add key and value iterators (the key iterator will not return GC'ed keys), so MMapIndexInput can use a WeakIdentityMap backed by ConcurrentHashMap and needs no synchronization. ConcurrentHashMap has better concurrency because it spreads the hash keys over separately locked segments, so threads working on different keys rarely contend for the same lock.
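
As a rough illustration of the idea described above, here is a minimal sketch of a weak identity map on top of ConcurrentHashMap whose key iterator skips keys that have already been garbage collected. It is not the code from the patch: the class name WeakIdentityMapSketch, the nested IdentityWeakRef, and the reap() helper are hypothetical names chosen for this example, and unlike the real WeakIdentityMap the sketch rejects null keys.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch only; not Lucene's actual WeakIdentityMap. */
final class WeakIdentityMapSketch<K, V> {
  private final ConcurrentHashMap<IdentityWeakRef, V> backing = new ConcurrentHashMap<>();
  private final ReferenceQueue<Object> queue = new ReferenceQueue<>();

  /** Weak reference that hashes and compares by identity of its referent. */
  private static final class IdentityWeakRef extends WeakReference<Object> {
    private final int hash;

    IdentityWeakRef(Object referent, ReferenceQueue<Object> queue) {
      super(referent, queue);
      this.hash = System.identityHashCode(referent);
    }

    @Override public int hashCode() { return hash; }

    @Override public boolean equals(Object o) {
      if (this == o) return true; // lets reap() remove cleared references
      if (!(o instanceof IdentityWeakRef)) return false;
      Object mine = get();
      return mine != null && mine == ((IdentityWeakRef) o).get();
    }
  }

  public V put(K key, V value) {
    Objects.requireNonNull(key, "null keys are not supported in this sketch");
    reap();
    return backing.put(new IdentityWeakRef(key, queue), value);
  }

  public V get(K key) {
    // Lookup key is not registered with the queue; it is only used for comparison.
    return backing.get(new IdentityWeakRef(key, null));
  }

  public int size() {
    reap();
    return backing.size();
  }

  /** Iterates over live keys only; collected keys are silently skipped. */
  public Iterator<K> keyIterator() {
    reap();
    final Iterator<IdentityWeakRef> refs = backing.keySet().iterator();
    return new Iterator<K>() {
      private K pending; // strong reference keeps the key alive until handed out

      @Override public boolean hasNext() {
        while (pending == null && refs.hasNext()) {
          @SuppressWarnings("unchecked") K k = (K) refs.next().get();
          pending = k; // null means the key was collected, so keep looking
        }
        return pending != null;
      }

      @Override public K next() {
        if (!hasNext()) throw new NoSuchElementException();
        K k = pending;
        pending = null;
        return k;
      }
    };
  }

  /** Removes entries whose keys the garbage collector has already cleared. */
  private void reap() {
    Reference<?> ref;
    while ((ref = queue.poll()) != null) {
      backing.remove(ref);
    }
  }
}
```

No method in the sketch is synchronized; all coordination is left to ConcurrentHashMap, whose iterators are weakly consistent and never throw ConcurrentModificationException. Holding the next key in a strong reference between hasNext() and next() is what guarantees that a key reported as available cannot vanish before it is returned.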

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Updated] (LUCENE-4163) Improve concurrency in MMapIndexInput.clone()

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-4163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated LUCENE-4163:
----------------------------------

    Attachment: LUCENE-4163.patch

Patch.

This patch also refactors the close() method of MMapIndexInput slightly to cope with the fact that we no longer have synchronization. It marks the IndexInput as closed (buffers = null) as the first step, so a later clone() or other access fails with AlreadyClosedException. After unsetting its own buffers, it unsets the buffers of all clones and finally unmaps them.
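
The ordering described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the patch itself: MMapInputSketch, cloneInput() and readByte(int) are hypothetical names, heap ByteBuffers stand in for mapped ones (so the unmap step is only a comment), clones are tracked in a plain concurrent set instead of a weak map, and IllegalStateException stands in for Lucene's AlreadyClosedException. It shows the order of the steps only and does not claim to close every race.

```java
import java.nio.ByteBuffer;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch of the close() ordering; not the real MMapIndexInput. */
final class MMapInputSketch {
  private volatile ByteBuffer[] buffers;      // null means "closed"
  private final boolean isClone;
  private final Set<MMapInputSketch> clones;  // shared between master and clones

  MMapInputSketch(int size) {
    this.buffers = new ByteBuffer[] { ByteBuffer.allocate(size) };
    this.isClone = false;
    this.clones = ConcurrentHashMap.newKeySet();
  }

  private MMapInputSketch(ByteBuffer[] duplicated, Set<MMapInputSketch> clones) {
    this.buffers = duplicated;
    this.isClone = true;
    this.clones = clones;
  }

  MMapInputSketch cloneInput() {
    ByteBuffer[] local = buffers;             // read the volatile field once
    if (local == null) throw new IllegalStateException("already closed");
    ByteBuffer[] dup = new ByteBuffer[local.length];
    for (int i = 0; i < local.length; i++) {
      dup[i] = local[i].duplicate();          // shares content, own position
    }
    MMapInputSketch c = new MMapInputSketch(dup, clones);
    clones.add(c);                            // no synchronized block needed
    return c;
  }

  byte readByte(int pos) {
    ByteBuffer[] local = buffers;
    if (local == null) throw new IllegalStateException("already closed");
    return local[0].get(pos);
  }

  void close() {
    if (buffers == null) return;              // closing twice is a no-op
    buffers = null;                           // step 1: mark this input closed
    if (isClone) return;                      // clones never unmap
    for (MMapInputSketch c : clones) {
      c.buffers = null;                       // step 2: invalidate all clones
    }
    clones.clear();
    // step 3: the real code would unmap the mapped buffers here
  }
}
```

Marking the master closed before touching the clones means a concurrent cloneInput() fails fast instead of handing out a buffer that is about to be unmapped.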
                


[jira] [Updated] (LUCENE-4163) Improve concurrency in MMapIndexInput.clone()

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-4163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated LUCENE-4163:
----------------------------------

    Attachment:     (was: LUCENE-4163.patch)
    


[jira] [Updated] (LUCENE-4163) Improve concurrency in MMapIndexInput.clone()

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-4163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated LUCENE-4163:
----------------------------------

    Attachment: LUCENE-4163.patch

Patch with updated Javadocs.
                


[jira] [Commented] (LUCENE-4163) Improve concurrency in MMapIndexInput.clone()

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-4163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13399231#comment-13399231 ] 

Uwe Schindler commented on LUCENE-4163:
---------------------------------------

Adrien: right, will do!
                


[jira] [Updated] (LUCENE-4163) Improve concurrency in MMapIndexInput.clone()

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-4163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated LUCENE-4163:
----------------------------------

    Attachment: LUCENE-4163.patch
    


[jira] [Updated] (LUCENE-4163) Improve concurrency in MMapIndexInput.clone()

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-4163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated LUCENE-4163:
----------------------------------

    Fix Version/s:     (was: 3.6.1)

For now I will not commit this to 3.6.1; it may be too risky (especially as that branch is Java 5, where ConcurrentHashMap may sometimes deadlock under heavy load).

If this should be backported, reopen before release.
                


[jira] [Updated] (LUCENE-4163) Improve concurrency in MMapIndexInput.clone()

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-4163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated LUCENE-4163:
----------------------------------

    Attachment: LUCENE-4163.patch

Slightly improved test (null keys and iterator conformance).

I think that's ready to commit and brings a big improvement in concurrency. We should backport this to 3.6.1!
                


[jira] [Commented] (LUCENE-4163) Improve concurrency in MMapIndexInput.clone()

Posted by "Adrien Grand (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-4163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13399229#comment-13399229 ] 

Adrien Grand commented on LUCENE-4163:
--------------------------------------

+1

Maybe we should also update {{WeakIdentityMap}} documentation now that it has key and value iterators:

bq. This implementation was forked from <a href="http://cxf.apache.org/">Apache CXF</a> but modified to <b>not</b> implement the {@link java.util.Map} interface and without any set/iterator views on it, as those are error-prone and inefficient
                


[jira] [Resolved] (LUCENE-4163) Improve concurrency in MMapIndexInput.clone()

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-4163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler resolved LUCENE-4163.
-----------------------------------

    Resolution: Fixed

Committed trunk revision: 1353101
Committed 4.x branch revision: 1353102
                