Posted to dev@jackrabbit.apache.org by "Antonio Martinez (JIRA)" <ji...@apache.org> on 2009/12/07 22:20:19 UTC

[jira] Created: (JCR-2426) Deadlock in lucene (Jackrabbit 1.4.4)

Deadlock in lucene (Jackrabbit 1.4.4)
-------------------------------------

                 Key: JCR-2426
                 URL: https://issues.apache.org/jira/browse/JCR-2426
             Project: Jackrabbit Content Repository
          Issue Type: Bug
          Components: indexing
    Affects Versions: core 1.4.4
            Reporter: Antonio Martinez
            Priority: Critical
             Fix For: core 1.4.4


We get a deadlock in the Lucene part of Jackrabbit (see deadlock_summary.txt).
This issue has been observed in a production setup running Jackrabbit 1.4.4 in a cluster configuration.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (JCR-2426) Deadlock in lucene (Jackrabbit 1.4.4)

Posted by "Antonio Martinez (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JCR-2426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Antonio Martinez updated JCR-2426:
----------------------------------

    Description: 
We get a deadlock in the Lucene part of Jackrabbit (see deadlock_summary.txt).
This issue has been observed in two different production setups running Jackrabbit 1.4.4 in a cluster configuration.


  was:
We get a deadlock in the Lucene part of Jackrabbit (see deadlock_summary.txt).
This issue has been observed in a production setup running Jackrabbit 1.4.4 in a cluster configuration.





[jira] Commented: (JCR-2426) Deadlock in lucene (Jackrabbit 1.4.4)

Posted by "Antonio Martinez (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-2426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12833467#action_12833467 ] 

Antonio Martinez commented on JCR-2426:
---------------------------------------

When analyzing the thread dump with TDA we see:
    This thread dump contains monitors without a locking thread information. This means, the monitor is hold by a system thread or some external resource.

The class involved, "FSIndexInput", uses the "Descriptor" class, which overrides the finalize method and eventually calls the RandomAccessFile "close" and "finalize" methods.
Therefore it is possible that the holding thread is a system thread.
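
To make the pattern described above concrete, here is a minimal sketch of a file descriptor class that closes itself from a finalizer. It is an illustration only and is not copied from the Lucene sources: the class name FinalizableDescriptor and the isOpen() helper are hypothetical. The point is that finalize() is run by the JVM's finalizer thread (a system thread), and only once the object is no longer reachable.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical illustration of the "close from finalize" pattern;
// this is NOT the actual Lucene Descriptor class.
final class FinalizableDescriptor extends RandomAccessFile {
    private volatile boolean open = true;

    FinalizableDescriptor(File file, String mode) throws IOException {
        super(file, mode);
    }

    boolean isOpen() {
        return open;
    }

    @Override
    public void close() throws IOException {
        // Guard so that an explicit close followed by finalization is harmless.
        if (open) {
            open = false;
            super.close();
        }
    }

    @Override
    @SuppressWarnings({"deprecation", "removal"})
    protected void finalize() throws Throwable {
        // Invoked by the JVM finalizer thread after the object becomes unreachable.
        try {
            close();
        } finally {
            super.finalize();
        }
    }
}
```

Note that finalizers are deprecated in current Java releases; the sketch only mirrors the style of code under discussion.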




[jira] Updated: (JCR-2426) Deadlock in lucene (Jackrabbit 1.4.4)

Posted by "Jukka Zitting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JCR-2426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jukka Zitting updated JCR-2426:
-------------------------------

    Fix Version/s:     (was: core 1.4.4)

Jackrabbit 1.4 is a fairly old version. Can you upgrade to Jackrabbit 1.6? Does the problem still occur with 1.6?




[jira] Commented: (JCR-2426) Deadlock in lucene (Jackrabbit 1.4.4)

Posted by "Antonio Martinez (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-2426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12788335#action_12788335 ] 

Antonio Martinez commented on JCR-2426:
---------------------------------------

We are using JVM version 1.6.0_06-b02 for Solaris SPARC

bash-3.00$ cat /etc/release
                      Solaris 10 10/08 s10s_u6wos_07b SPARC
           Copyright 2008 Sun Microsystems, Inc.  All Rights Reserved.
                        Use is subject to license terms.
                            Assembled 27 October 2008 
bash-3.00$ uname -a
SunOS op06udb1 5.10 Generic_138888-03 sun4v sparc SUNW,Sun-Blade-T6340





[jira] Commented: (JCR-2426) Deadlock in lucene (Jackrabbit 1.4.4)

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-2426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12833557#action_12833557 ] 

Michael McCandless commented on JCR-2426:
-----------------------------------------

That's a dangerous JRE to use with Lucene -- 1.6 JREs older than _10
can cause index corruption due to a known Sun HotSpot compiler bug
(see LUCENE-1282, referenced from
http://wiki.apache.org/lucene-java/SunJavaBugs).
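
As a rough aid for checking a deployment against the threshold mentioned above, the following sketch parses a legacy Sun version string (for example the "1.6.0_06-b02" reported in this issue) and tells whether it is a 1.6.0 JRE below update 10. The class and method names are hypothetical, and the only rule encoded is the "_10" threshold from this comment; it does not check for any other JVM bug.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; the only rule it encodes is "1.6.0 with update < 10".
final class JreUpdateCheck {
    // Matches legacy version strings such as "1.6.0", "1.6.0_06" or "1.6.0_06-b02".
    private static final Pattern SUN_16 =
            Pattern.compile("^1\\.6\\.0(?:_(\\d+))?(?:-.*)?$");

    static boolean isJava16BelowUpdate10(String version) {
        Matcher m = SUN_16.matcher(version);
        if (!m.matches()) {
            return false; // not a 1.6.0 version string
        }
        // A missing update suffix is treated as update 0.
        int update = m.group(1) == null ? 0 : Integer.parseInt(m.group(1));
        return update < 10;
    }

    public static void main(String[] args) {
        String version = System.getProperty("java.version");
        System.out.println(version + ": below 1.6.0_10? "
                + isJava16BelowUpdate10(version));
    }
}
```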

Also, consider switching to NIOFSDirectory (if you're not on Windows) -- it
avoids contention when multiple threads want to read from the same
file.
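
The contention point can be sketched with plain JDK classes. The first method below seeks and reads on a shared RandomAccessFile, so all readers must serialize on one lock; the second uses a positional FileChannel read, which takes the offset as an argument and leaves the channel position untouched. This is a simplified illustration of the idea, with hypothetical class and method names; it is not the Lucene implementation.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

// Hypothetical illustration of two ways to share one open file between threads.
final class SharedFileReads {
    // Seek-then-read: the file pointer is shared state, so the whole
    // operation must run under a lock that every reader contends for.
    static byte[] readWithLock(RandomAccessFile file, long position, int length)
            throws IOException {
        byte[] out = new byte[length];
        synchronized (file) {
            file.seek(position);
            file.readFully(out);
        }
        return out;
    }

    // Positional read: the offset travels with the call and the channel's
    // own position is not modified, so callers need no common lock.
    static byte[] readPositional(FileChannel channel, long position, int length)
            throws IOException {
        ByteBuffer buffer = ByteBuffer.allocate(length);
        long offset = position;
        while (buffer.hasRemaining()) {
            int n = channel.read(buffer, offset);
            if (n < 0) {
                throw new EOFException("end of file at offset " + offset);
            }
            offset += n;
        }
        return buffer.array();
    }
}
```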

As of 2.9, Lucene has removed the
FSDirectory$FSIndexInput$Descriptor's finalize method.  But, even so,
that finalize method is only invoked when the instance is being GC'd,
which makes no sense given that there are other threads actively using
it for searching/merging.  I think we need something else to explain
why the thread dump seems to incorrectly claim that
"jmssecondaryApplnJobExecutor-7" is holding the lock...





[jira] Updated: (JCR-2426) Deadlock in lucene (Jackrabbit 1.4.4)

Posted by "Jukka Zitting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JCR-2426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jukka Zitting updated JCR-2426:
-------------------------------

    Priority: Minor  (was: Blocker)

Reducing priority as this doesn't seem to affect more recent Jackrabbit versions. I'd resolve this as Won't Fix unless someone comes up with a repeatable test case and a patch for fixing this. The recommended solution is to upgrade your deployment.

The 1.6.0 deadlock you noticed was fixed in JCR-2525.




[jira] Updated: (JCR-2426) Deadlock in lucene (Jackrabbit 1.4.4)

Posted by "Antonio Martinez (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JCR-2426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Antonio Martinez updated JCR-2426:
----------------------------------

    Attachment: deadlock_summary.txt




[jira] Updated: (JCR-2426) Deadlock in lucene (Jackrabbit 1.4.4)

Posted by "Antonio Martinez (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JCR-2426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Antonio Martinez updated JCR-2426:
----------------------------------

    Attachment: deadlock_2nd_setup.txt

Both setups have the same JVM version and thread dump (see deadlock_2nd_setup.txt).




[jira] Commented: (JCR-2426) Deadlock in lucene (Jackrabbit 1.4.4)

Posted by "Marcel Reutegger (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-2426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12835152#action_12835152 ] 

Marcel Reutegger commented on JCR-2426:
---------------------------------------

Could you please retry with a more recent version of Java 1.6? Thanks.




[jira] Commented: (JCR-2426) Deadlock in lucene (Jackrabbit 1.4.4)

Posted by "Marcel Reutegger (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-2426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12833769#action_12833769 ] 

Marcel Reutegger commented on JCR-2426:
---------------------------------------

> This issue has happened in another setup

Can you please provide more details on that setup (JVM version, thread dumps, etc.)?




[jira] Updated: (JCR-2426) Deadlock in lucene (Jackrabbit 1.4.4)

Posted by "Antonio Martinez (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JCR-2426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Antonio Martinez updated JCR-2426:
----------------------------------

    Attachment: deadlock_jackrabbit1.6.txt




[jira] Commented: (JCR-2426) Deadlock in lucene (Jackrabbit 1.4.4)

Posted by "Marcel Reutegger (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-2426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12787993#action_12787993 ] 

Marcel Reutegger commented on JCR-2426:
---------------------------------------

That looks very strange.

The JVM says:

"IndexMerger":
  waiting to lock monitor 0x03764818 (object 0x80c376b0, a org.apache.lucene.store.FSDirectory$FSIndexInput),
  which is held by "jmssecondaryApplnJobExecutor-7"

This is not correct: "jmssecondaryApplnJobExecutor-7" does not hold this lock.

This might be a JVM issue. What kind of JVM are you using? See also related reports here:
http://forums.sun.com/thread.jspa?messageID=2941439
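
One way to cross-check what a textual thread dump reports is to ask the JVM directly through the standard java.lang.management API (available since Java 6). The sketch below lists each deadlocked thread together with the lock it waits for and the thread the JVM believes owns that lock. The class and method names are hypothetical, and since the data comes from the same JVM, it may repeat a misreported owner rather than correct it.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper built on the standard ThreadMXBean API.
final class DeadlockReport {
    static List<String> describeDeadlocks() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        // Returns null when the JVM detects no deadlocked threads.
        long[] ids = threads.findDeadlockedThreads();
        List<String> lines = new ArrayList<String>();
        if (ids == null) {
            return lines;
        }
        for (ThreadInfo info : threads.getThreadInfo(ids, true, true)) {
            if (info == null) {
                continue; // thread terminated between the two calls
            }
            lines.add("\"" + info.getThreadName() + "\" waits for "
                    + info.getLockName() + " held by \""
                    + info.getLockOwnerName() + "\"");
        }
        return lines;
    }

    public static void main(String[] args) {
        List<String> report = describeDeadlocks();
        if (report.isEmpty()) {
            System.out.println("No deadlock detected");
        }
        for (String line : report) {
            System.out.println(line);
        }
    }
}
```

Such a check could run periodically inside the application, so that a report is captured at the moment the deadlock appears rather than only when someone takes a thread dump by hand.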





[jira] Updated: (JCR-2426) Deadlock in lucene (Jackrabbit 1.4.4)

Posted by "Antonio Martinez (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JCR-2426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Antonio Martinez updated JCR-2426:
----------------------------------

    Priority: Blocker  (was: Critical)




[jira] Commented: (JCR-2426) Deadlock in lucene (Jackrabbit 1.4.4)

Posted by "Antonio Martinez (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-2426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12839069#action_12839069 ] 

Antonio Martinez commented on JCR-2426:
---------------------------------------

Moving to a newer version of Java is too risky for the customer right now.

What I have done instead is move to a newer version of Jackrabbit, 1.6.0 (and I also used NIOFS). With this I'm getting a different deadlock; see deadlock_jackrabbit1.6.txt (again seen only in the performance setup, after some time with high activity in the JCR).




[jira] Commented: (JCR-2426) Deadlock in lucene (Jackrabbit 1.4.4)

Posted by "Antonio Martinez (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-2426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12833710#action_12833710 ] 

Antonio Martinez commented on JCR-2426:
---------------------------------------

This issue has happened in another setup 

Question: is it known whether Lucene 2.9 is compatible with Jackrabbit 1.4?




[jira] Commented: (JCR-2426) Deadlock in lucene (Jackrabbit 1.4.4)

Posted by "Marcel Reutegger (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-2426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12833768#action_12833768 ] 

Marcel Reutegger commented on JCR-2426:
---------------------------------------

No, Jackrabbit 1.4.x only works with lucene-core 2.2.

