You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@lucene.apache.org by "Shai Erera (JIRA)" <ji...@apache.org> on 2010/04/29 15:16:54 UTC

[jira] Created: (LUCENE-2421) Hardening of NativeFSLock

Hardening of NativeFSLock
-------------------------

                 Key: LUCENE-2421
                 URL: https://issues.apache.org/jira/browse/LUCENE-2421
             Project: Lucene - Java
          Issue Type: Improvement
          Components: Index
            Reporter: Shai Erera
            Assignee: Shai Erera
             Fix For: 3.1


NativeFSLock create a test lock file which its name might collide w/ another JVM that is running. Very unlikely, but still it happened a couple of times already, since the tests were parallelized. This may result in a false exception thrown from release(), when the lock file's delete() is called and returns false, because the file does not exist (deleted by another JVM already). In addition, release() should give a second attempt to delete() if it fails, since the file may be held temporarily by another process (like AntiVirus) before it fails. The proposed changes are:

1) Use ManagementFactory.getRuntimeMXBean().getName() as part of the test lock name (should include the process Id)
2) In release(), if delete() fails, check if the file indeed exists. If it is, let's attempt a re-delete() few ms later.
3) If (3) still fails, throw an exception. Alternatively, we can attempt a deleteOnExit.

I'll post a patch later today.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Commented: (LUCENE-2421) Hardening of NativeFSLock

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12862706#action_12862706 ] 

Michael McCandless commented on LUCENE-2421:
--------------------------------------------

I think 100 msec is way too long -- I was thinking more like 10 msec :)



> Hardening of NativeFSLock
> -------------------------
>
>                 Key: LUCENE-2421
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2421
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>             Fix For: 3.1
>
>         Attachments: LUCENE-2421.patch
>
>
> NativeFSLock create a test lock file which its name might collide w/ another JVM that is running. Very unlikely, but still it happened a couple of times already, since the tests were parallelized. This may result in a false exception thrown from release(), when the lock file's delete() is called and returns false, because the file does not exist (deleted by another JVM already). In addition, release() should give a second attempt to delete() if it fails, since the file may be held temporarily by another process (like AntiVirus) before it fails. The proposed changes are:
> 1) Use ManagementFactory.getRuntimeMXBean().getName() as part of the test lock name (should include the process Id)
> 2) In release(), if delete() fails, check if the file indeed exists. If it is, let's attempt a re-delete() few ms later.
> 3) If (3) still fails, throw an exception. Alternatively, we can attempt a deleteOnExit.
> I'll post a patch later today.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Commented: (LUCENE-2421) Hardening of NativeFSLock

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12863323#action_12863323 ] 

Michael McCandless commented on LUCENE-2421:
--------------------------------------------

Thanks Shai -- this looks great!

> Hardening of NativeFSLock
> -------------------------
>
>                 Key: LUCENE-2421
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2421
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>             Fix For: 3.1
>
>         Attachments: LUCENE-2421.patch, LUCENE-2421.patch, LUCENE-2421.patch, LUCENE-2421.patch
>
>
> NativeFSLock create a test lock file which its name might collide w/ another JVM that is running. Very unlikely, but still it happened a couple of times already, since the tests were parallelized. This may result in a false exception thrown from release(), when the lock file's delete() is called and returns false, because the file does not exist (deleted by another JVM already). In addition, release() should give a second attempt to delete() if it fails, since the file may be held temporarily by another process (like AntiVirus) before it fails. The proposed changes are:
> 1) Use ManagementFactory.getRuntimeMXBean().getName() as part of the test lock name (should include the process Id)
> 2) In release(), if delete() fails, check if the file indeed exists. If it is, let's attempt a re-delete() few ms later.
> 3) If (3) still fails, throw an exception. Alternatively, we can attempt a deleteOnExit.
> I'll post a patch later today.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Updated: (LUCENE-2421) Hardening of NativeFSLock

Posted by "Shai Erera (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shai Erera updated LUCENE-2421:
-------------------------------

    Attachment: LUCENE-2421.patch

Patch adds throwin ThreadInterruptedException.

About using MXBean name, I wrote this code:
{code}
String name = "d:/temp/file-" + ManagementFactory.getRuntimeMXBean().getName();
new File(name).createNewFile();
System.out.println(new File(name).exists());
{code}

The file created included the @ character, which is valid on Windows (and also verified on Linux). But I'm still not sure about using it, because if a system returns a character that is not a valid file character, we may not get the same file name we think ... so maybe if we still want to use it, we should move it to the end of the string, so that the name includes the random number for sure ... what do you think?

> Hardening of NativeFSLock
> -------------------------
>
>                 Key: LUCENE-2421
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2421
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>             Fix For: 3.1
>
>         Attachments: LUCENE-2421.patch, LUCENE-2421.patch
>
>
> NativeFSLock create a test lock file which its name might collide w/ another JVM that is running. Very unlikely, but still it happened a couple of times already, since the tests were parallelized. This may result in a false exception thrown from release(), when the lock file's delete() is called and returns false, because the file does not exist (deleted by another JVM already). In addition, release() should give a second attempt to delete() if it fails, since the file may be held temporarily by another process (like AntiVirus) before it fails. The proposed changes are:
> 1) Use ManagementFactory.getRuntimeMXBean().getName() as part of the test lock name (should include the process Id)
> 2) In release(), if delete() fails, check if the file indeed exists. If it is, let's attempt a re-delete() few ms later.
> 3) If (3) still fails, throw an exception. Alternatively, we can attempt a deleteOnExit.
> I'll post a patch later today.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Commented: (LUCENE-2421) Hardening of NativeFSLock

Posted by "Shai Erera (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12862725#action_12862725 ] 

Shai Erera commented on LUCENE-2421:
------------------------------------

Well ... we are talking about a fallback, last chance code, that should hardly ever be executed (i.e., I'm pretty sure the delete() failed b/c the file wasn't there anymore). So if we've reached that situation, I think we should give a reasonable "last chance"? So that if the file can't be deleted since it's locked by AntiVirus, we should be on the safe side?

But I don't have any strong feelings for it - if you think 10 ms are enough, I'll change it.

> Hardening of NativeFSLock
> -------------------------
>
>                 Key: LUCENE-2421
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2421
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>             Fix For: 3.1
>
>         Attachments: LUCENE-2421.patch
>
>
> NativeFSLock create a test lock file which its name might collide w/ another JVM that is running. Very unlikely, but still it happened a couple of times already, since the tests were parallelized. This may result in a false exception thrown from release(), when the lock file's delete() is called and returns false, because the file does not exist (deleted by another JVM already). In addition, release() should give a second attempt to delete() if it fails, since the file may be held temporarily by another process (like AntiVirus) before it fails. The proposed changes are:
> 1) Use ManagementFactory.getRuntimeMXBean().getName() as part of the test lock name (should include the process Id)
> 2) In release(), if delete() fails, check if the file indeed exists. If it is, let's attempt a re-delete() few ms later.
> 3) If (3) still fails, throw an exception. Alternatively, we can attempt a deleteOnExit.
> I'll post a patch later today.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Resolved: (LUCENE-2421) Hardening of NativeFSLock

Posted by "Shai Erera (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shai Erera resolved LUCENE-2421.
--------------------------------

    Resolution: Fixed

Fixed on 3x and trunk

> Hardening of NativeFSLock
> -------------------------
>
>                 Key: LUCENE-2421
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2421
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>             Fix For: 3.1, 4.0, 3.0.2, 2.9.3
>
>         Attachments: LUCENE-2421-2.patch, LUCENE-2421.patch, LUCENE-2421.patch, LUCENE-2421.patch, LUCENE-2421.patch
>
>
> NativeFSLock create a test lock file which its name might collide w/ another JVM that is running. Very unlikely, but still it happened a couple of times already, since the tests were parallelized. This may result in a false exception thrown from release(), when the lock file's delete() is called and returns false, because the file does not exist (deleted by another JVM already). In addition, release() should give a second attempt to delete() if it fails, since the file may be held temporarily by another process (like AntiVirus) before it fails. The proposed changes are:
> 1) Use ManagementFactory.getRuntimeMXBean().getName() as part of the test lock name (should include the process Id)
> 2) In release(), if delete() fails, check if the file indeed exists. If it is, let's attempt a re-delete() few ms later.
> 3) If (3) still fails, throw an exception. Alternatively, we can attempt a deleteOnExit.
> I'll post a patch later today.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Updated: (LUCENE-2421) Hardening of NativeFSLock

Posted by "Shai Erera (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shai Erera updated LUCENE-2421:
-------------------------------

    Attachment: LUCENE-2421.patch

Patch includes:
* Not throwing exception if the delete() fails
* Added test case to TestLockFactory
* Record in Runtime Behavior in CHANGES

> Hardening of NativeFSLock
> -------------------------
>
>                 Key: LUCENE-2421
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2421
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>             Fix For: 3.1
>
>         Attachments: LUCENE-2421.patch, LUCENE-2421.patch, LUCENE-2421.patch
>
>
> NativeFSLock create a test lock file which its name might collide w/ another JVM that is running. Very unlikely, but still it happened a couple of times already, since the tests were parallelized. This may result in a false exception thrown from release(), when the lock file's delete() is called and returns false, because the file does not exist (deleted by another JVM already). In addition, release() should give a second attempt to delete() if it fails, since the file may be held temporarily by another process (like AntiVirus) before it fails. The proposed changes are:
> 1) Use ManagementFactory.getRuntimeMXBean().getName() as part of the test lock name (should include the process Id)
> 2) In release(), if delete() fails, check if the file indeed exists. If it is, let's attempt a re-delete() few ms later.
> 3) If (3) still fails, throw an exception. Alternatively, we can attempt a deleteOnExit.
> I'll post a patch later today.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Reopened: (LUCENE-2421) Hardening of NativeFSLock

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless reopened LUCENE-2421:
----------------------------------------


backport

> Hardening of NativeFSLock
> -------------------------
>
>                 Key: LUCENE-2421
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2421
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>             Fix For: 2.9.3, 3.0.2, 3.1, 4.0
>
>         Attachments: LUCENE-2421.patch, LUCENE-2421.patch, LUCENE-2421.patch, LUCENE-2421.patch
>
>
> NativeFSLock create a test lock file which its name might collide w/ another JVM that is running. Very unlikely, but still it happened a couple of times already, since the tests were parallelized. This may result in a false exception thrown from release(), when the lock file's delete() is called and returns false, because the file does not exist (deleted by another JVM already). In addition, release() should give a second attempt to delete() if it fails, since the file may be held temporarily by another process (like AntiVirus) before it fails. The proposed changes are:
> 1) Use ManagementFactory.getRuntimeMXBean().getName() as part of the test lock name (should include the process Id)
> 2) In release(), if delete() fails, check if the file indeed exists. If it is, let's attempt a re-delete() few ms later.
> 3) If (3) still fails, throw an exception. Alternatively, we can attempt a deleteOnExit.
> I'll post a patch later today.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Commented: (LUCENE-2421) Hardening of NativeFSLock

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12918019#action_12918019 ] 

Michael McCandless commented on LUCENE-2421:
--------------------------------------------

So, I'm confused on how/why the *test* lock name would get a conflict
with multiple JREs running our unit tests.

Ie, each JRE runs tests in certain index directories.  And the test
lock is acquired in the index directory.  So, if tests running in
different JREs are sharing the same index directory, then more serious
problems will ensue.

I think it must have only been the test output formatter
(LuceneJUnitResultFormatter) that was conflicting on the test lock
name.  Because this is the only locking usage that'd be sharing the
same lock dir across JREs...


> Hardening of NativeFSLock
> -------------------------
>
>                 Key: LUCENE-2421
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2421
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>             Fix For: 2.9.3, 3.0.2, 3.1, 4.0
>
>         Attachments: LUCENE-2421-2.patch, LUCENE-2421.patch, LUCENE-2421.patch, LUCENE-2421.patch, LUCENE-2421.patch
>
>
> NativeFSLock create a test lock file which its name might collide w/ another JVM that is running. Very unlikely, but still it happened a couple of times already, since the tests were parallelized. This may result in a false exception thrown from release(), when the lock file's delete() is called and returns false, because the file does not exist (deleted by another JVM already). In addition, release() should give a second attempt to delete() if it fails, since the file may be held temporarily by another process (like AntiVirus) before it fails. The proposed changes are:
> 1) Use ManagementFactory.getRuntimeMXBean().getName() as part of the test lock name (should include the process Id)
> 2) In release(), if delete() fails, check if the file indeed exists. If it is, let's attempt a re-delete() few ms later.
> 3) If (3) still fails, throw an exception. Alternatively, we can attempt a deleteOnExit.
> I'll post a patch later today.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Updated: (LUCENE-2421) Hardening of NativeFSLock

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-2421:
---------------------------------------

    Fix Version/s: 2.9.3
                   3.0.2

> Hardening of NativeFSLock
> -------------------------
>
>                 Key: LUCENE-2421
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2421
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>             Fix For: 2.9.3, 3.0.2, 3.1, 4.0
>
>         Attachments: LUCENE-2421.patch, LUCENE-2421.patch, LUCENE-2421.patch, LUCENE-2421.patch
>
>
> NativeFSLock create a test lock file which its name might collide w/ another JVM that is running. Very unlikely, but still it happened a couple of times already, since the tests were parallelized. This may result in a false exception thrown from release(), when the lock file's delete() is called and returns false, because the file does not exist (deleted by another JVM already). In addition, release() should give a second attempt to delete() if it fails, since the file may be held temporarily by another process (like AntiVirus) before it fails. The proposed changes are:
> 1) Use ManagementFactory.getRuntimeMXBean().getName() as part of the test lock name (should include the process Id)
> 2) In release(), if delete() fails, check if the file indeed exists. If it is, let's attempt a re-delete() few ms later.
> 3) If (3) still fails, throw an exception. Alternatively, we can attempt a deleteOnExit.
> I'll post a patch later today.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Commented: (LUCENE-2421) Hardening of NativeFSLock

Posted by "Mark Miller (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12863011#action_12863011 ] 

Mark Miller commented on LUCENE-2421:
-------------------------------------

I'm still confused on the native/simple working together thing - I don't see where native will respect simple's lock? Am I missing something?

Also, its almost more confusing that simple will respect the native lock file, when that file is not even native's actual lock - just its medium for a lock. If they could really work together, why even have a native lock?

> Hardening of NativeFSLock
> -------------------------
>
>                 Key: LUCENE-2421
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2421
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>             Fix For: 3.1
>
>         Attachments: LUCENE-2421.patch, LUCENE-2421.patch
>
>
> NativeFSLock create a test lock file which its name might collide w/ another JVM that is running. Very unlikely, but still it happened a couple of times already, since the tests were parallelized. This may result in a false exception thrown from release(), when the lock file's delete() is called and returns false, because the file does not exist (deleted by another JVM already). In addition, release() should give a second attempt to delete() if it fails, since the file may be held temporarily by another process (like AntiVirus) before it fails. The proposed changes are:
> 1) Use ManagementFactory.getRuntimeMXBean().getName() as part of the test lock name (should include the process Id)
> 2) In release(), if delete() fails, check if the file indeed exists. If it is, let's attempt a re-delete() few ms later.
> 3) If (3) still fails, throw an exception. Alternatively, we can attempt a deleteOnExit.
> I'll post a patch later today.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Commented: (LUCENE-2421) Hardening of NativeFSLock

Posted by "Shai Erera (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12862967#action_12862967 ] 

Shai Erera commented on LUCENE-2421:
------------------------------------

I guess that you've missed that on the thread: "_It is possible that two JVMs will attempt to lock the same Directory, one w/ Native and the other w/ Simple. If we won't check in obtain() whether the file exists, it might obtain a native lock, while the Directory is actually locked by another JVM using Simple_". Uwe also mentioned Native was fixed to use the same lock file name in 2.9 because of that.

Another thing why we cannot leave the lock file behind is because if you e.g. switch from Native to Simple you won't be able to obtain a lock.

And personally I prefer that if the Directory is not locked then the file won't be there - even if just for clarity, or because how we've all become used to treat the existence of the lock file by now. And I'd also hate to add another line to bw section :)

> Hardening of NativeFSLock
> -------------------------
>
>                 Key: LUCENE-2421
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2421
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>             Fix For: 3.1
>
>         Attachments: LUCENE-2421.patch
>
>
> NativeFSLock create a test lock file which its name might collide w/ another JVM that is running. Very unlikely, but still it happened a couple of times already, since the tests were parallelized. This may result in a false exception thrown from release(), when the lock file's delete() is called and returns false, because the file does not exist (deleted by another JVM already). In addition, release() should give a second attempt to delete() if it fails, since the file may be held temporarily by another process (like AntiVirus) before it fails. The proposed changes are:
> 1) Use ManagementFactory.getRuntimeMXBean().getName() as part of the test lock name (should include the process Id)
> 2) In release(), if delete() fails, check if the file indeed exists. If it is, let's attempt a re-delete() few ms later.
> 3) If (3) still fails, throw an exception. Alternatively, we can attempt a deleteOnExit.
> I'll post a patch later today.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Commented: (LUCENE-2421) Hardening of NativeFSLock

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12863316#action_12863316 ] 

Uwe Schindler commented on LUCENE-2421:
---------------------------------------

Looks good!

> Hardening of NativeFSLock
> -------------------------
>
>                 Key: LUCENE-2421
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2421
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>             Fix For: 3.1
>
>         Attachments: LUCENE-2421.patch, LUCENE-2421.patch, LUCENE-2421.patch, LUCENE-2421.patch
>
>
> NativeFSLock create a test lock file which its name might collide w/ another JVM that is running. Very unlikely, but still it happened a couple of times already, since the tests were parallelized. This may result in a false exception thrown from release(), when the lock file's delete() is called and returns false, because the file does not exist (deleted by another JVM already). In addition, release() should give a second attempt to delete() if it fails, since the file may be held temporarily by another process (like AntiVirus) before it fails. The proposed changes are:
> 1) Use ManagementFactory.getRuntimeMXBean().getName() as part of the test lock name (should include the process Id)
> 2) In release(), if delete() fails, check if the file indeed exists. If it is, let's attempt a re-delete() few ms later.
> 3) If (3) still fails, throw an exception. Alternatively, we can attempt a deleteOnExit.
> I'll post a patch later today.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Issue Comment Edited: (LUCENE-2421) Hardening of NativeFSLock

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12862994#action_12862994 ] 

Uwe Schindler edited comment on LUCENE-2421 at 5/1/10 3:51 AM:
---------------------------------------------------------------

bq. The file created included the @ character, which is valid on Windows (and also verified on Linux). But I'm still not sure about using it, because if a system returns a character that is not a valid file character, we may not get the same file name we think ... so maybe if we still want to use it, we should move it to the end of the string, so that the name includes the random number for sure ... what do you think?

I suggested on IRC to take the output of getName() and do a regexp replace and replace all non ACSII-digit chars:
{code}
ManagementFactory.getRuntimeMXBean().getName().replaceAll("[^a..zA..Z0..9]+","")
{code}
Or alternatively simply use the hashCode of that String (but that is less unique).

      was (Author: thetaphi):
    bq. The file created included the @ character, which is valid on Windows (and also verified on Linux). But I'm still not sure about using it, because if a system returns a character that is not a valid file character, we may not get the same file name we think ... so maybe if we still want to use it, we should move it to the end of the string, so that the name includes the random number for sure ... what do you think?

I suggested on IRC to take the output of getName() and do a regexp replace and replace all non ACSII-digit chars:
{code}
ManagementFactory.getRuntimeMXBean().getName().replaceAll("[^a..zA..Z0..9]","")
{code}
Or alternatively simply use the hashCode of that String (but that is less unique).
  
> Hardening of NativeFSLock
> -------------------------
>
>                 Key: LUCENE-2421
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2421
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>             Fix For: 3.1
>
>         Attachments: LUCENE-2421.patch, LUCENE-2421.patch
>
>
> NativeFSLock create a test lock file which its name might collide w/ another JVM that is running. Very unlikely, but still it happened a couple of times already, since the tests were parallelized. This may result in a false exception thrown from release(), when the lock file's delete() is called and returns false, because the file does not exist (deleted by another JVM already). In addition, release() should give a second attempt to delete() if it fails, since the file may be held temporarily by another process (like AntiVirus) before it fails. The proposed changes are:
> 1) Use ManagementFactory.getRuntimeMXBean().getName() as part of the test lock name (should include the process Id)
> 2) In release(), if delete() fails, check if the file indeed exists. If it is, let's attempt a re-delete() few ms later.
> 3) If (3) still fails, throw an exception. Alternatively, we can attempt a deleteOnExit.
> I'll post a patch later today.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Commented: (LUCENE-2421) Hardening of NativeFSLock

Posted by "Shai Erera (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12863288#action_12863288 ] 

Shai Erera commented on LUCENE-2421:
------------------------------------

That's just so that we "behave nicely" with the application and not leave behind the lock file. I'm thinking that if you see a write.lock file in the directory, it makes you wonder why it's there, and if the index was closed properly or not etc. The existence of the file does not tell you anything about which LF is used (Native or Simple) and so you cannot know for sure what went wrong. Going the extra mile, even if it's only in rare cases, seems harmless, no?

I don't think it's critical, but I also don't think the code complicates matters?

> Hardening of NativeFSLock
> -------------------------
>
>                 Key: LUCENE-2421
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2421
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>             Fix For: 3.1
>
>         Attachments: LUCENE-2421.patch, LUCENE-2421.patch, LUCENE-2421.patch
>
>
> NativeFSLock create a test lock file which its name might collide w/ another JVM that is running. Very unlikely, but still it happened a couple of times already, since the tests were parallelized. This may result in a false exception thrown from release(), when the lock file's delete() is called and returns false, because the file does not exist (deleted by another JVM already). In addition, release() should give a second attempt to delete() if it fails, since the file may be held temporarily by another process (like AntiVirus) before it fails. The proposed changes are:
> 1) Use ManagementFactory.getRuntimeMXBean().getName() as part of the test lock name (should include the process Id)
> 2) In release(), if delete() fails, check if the file indeed exists. If it is, let's attempt a re-delete() few ms later.
> 3) If (3) still fails, throw an exception. Alternatively, we can attempt a deleteOnExit.
> I'll post a patch later today.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Commented: (LUCENE-2421) Hardening of NativeFSLock

Posted by "Shai Erera (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12862805#action_12862805 ] 

Shai Erera commented on LUCENE-2421:
------------------------------------

bq.  Libraries shouldn't be doing things like sleeping on you (for so long)...

Well ... I think they'd appreciate the sleep instead of the immediate exception. Previuosly, I've been asked by customers to do that in other places. A 100ms sleep is barely noticeable (being so rare), but an exception definitely is! :)

bq. Isn't it "harmless" if we can't delete it?

No it isn't. I distinguish between two cases: the regular and test lock. For the regular lock, we've already agreed we cannot leave it behind as it will prevent future locking. For the test lock I catch the exception in acquireTestLock and call deleteOnExit() suppressing the exception.

bq. you should immediately throw ThreadInterruptedException

ok, thought it'd be harmless here to ignore it since I'm only doing last resort actions. But if that's common practice now, I will throw the exception.

Thanks for the comments !

> Hardening of NativeFSLock
> -------------------------
>
>                 Key: LUCENE-2421
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2421
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>             Fix For: 3.1
>
>         Attachments: LUCENE-2421.patch
>
>
> NativeFSLock create a test lock file which its name might collide w/ another JVM that is running. Very unlikely, but still it happened a couple of times already, since the tests were parallelized. This may result in a false exception thrown from release(), when the lock file's delete() is called and returns false, because the file does not exist (deleted by another JVM already). In addition, release() should give a second attempt to delete() if it fails, since the file may be held temporarily by another process (like AntiVirus) before it fails. The proposed changes are:
> 1) Use ManagementFactory.getRuntimeMXBean().getName() as part of the test lock name (should include the process Id)
> 2) In release(), if delete() fails, check if the file indeed exists. If it is, let's attempt a re-delete() few ms later.
> 3) If (3) still fails, throw an exception. Alternatively, we can attempt a deleteOnExit.
> I'll post a patch later today.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Commented: (LUCENE-2421) Hardening of NativeFSLock

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12918030#action_12918030 ] 

Michael McCandless commented on LUCENE-2421:
--------------------------------------------

I'll open a new issue, to remove acquireTestLock... all tests pass with it removed.

> Hardening of NativeFSLock
> -------------------------
>
>                 Key: LUCENE-2421
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2421
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>             Fix For: 2.9.3, 3.0.2, 3.1, 4.0
>
>         Attachments: LUCENE-2421-2.patch, LUCENE-2421.patch, LUCENE-2421.patch, LUCENE-2421.patch, LUCENE-2421.patch
>
>
> NativeFSLock create a test lock file which its name might collide w/ another JVM that is running. Very unlikely, but still it happened a couple of times already, since the tests were parallelized. This may result in a false exception thrown from release(), when the lock file's delete() is called and returns false, because the file does not exist (deleted by another JVM already). In addition, release() should give a second attempt to delete() if it fails, since the file may be held temporarily by another process (like AntiVirus) before it fails. The proposed changes are:
> 1) Use ManagementFactory.getRuntimeMXBean().getName() as part of the test lock name (should include the process Id)
> 2) In release(), if delete() fails, check if the file indeed exists. If it is, let's attempt a re-delete() few ms later.
> 3) If (3) still fails, throw an exception. Alternatively, we can attempt a deleteOnExit.
> I'll post a patch later today.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Commented: (LUCENE-2421) Hardening of NativeFSLock

Posted by "Mark Miller (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12862969#action_12862969 ] 

Mark Miller commented on LUCENE-2421:
-------------------------------------

bq. I guess that you've missed that on the thread

No, just havn't seen a good reason yet...

{quote} "It is possible that two JVMs will attempt to lock the same Directory, one w/ Native and the other w/ Simple. If we won't check in obtain() whether the file exists, it might obtain a native lock, while the Directory is actually locked by another JVM using Simple". Uwe also mentioned Native was fixed to use the same lock file name in 2.9 because of that.{quote}

Lock factories do not have to work with all other lock factories...you need to use the same lock factory across all process'.

bq. Another thing why we cannot leave the lock file behind is because if you e.g. switch from Native to Simple you won't be able to obtain a lock.

Not true - there is no reason both simple and native need to use the same lock file - or even that the native lock feel needs to be in the same dir (eg it could be in a tmp dir)

bq. And personally I prefer that if the Directory is not locked then the file won't be there 

You cannot guarantee this in any case. If they cannot be deleted, it cannot be deleted - but cleanliness is no reason to throw an exception in an event that is not actually a failure situation.

bq. even if just for clarity, or because how we've all become used to treat the existence of the lock file by now.

I don't find that persuasive myself...

bq. And I'd also hate to add another line to bw section

but its not a back compat break ...

> Hardening of NativeFSLock
> -------------------------
>
>                 Key: LUCENE-2421
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2421
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>             Fix For: 3.1
>
>         Attachments: LUCENE-2421.patch, LUCENE-2421.patch
>
>
> NativeFSLock create a test lock file which its name might collide w/ another JVM that is running. Very unlikely, but still it happened a couple of times already, since the tests were parallelized. This may result in a false exception thrown from release(), when the lock file's delete() is called and returns false, because the file does not exist (deleted by another JVM already). In addition, release() should give a second attempt to delete() if it fails, since the file may be held temporarily by another process (like AntiVirus) before it fails. The proposed changes are:
> 1) Use ManagementFactory.getRuntimeMXBean().getName() as part of the test lock name (should include the process Id)
> 2) In release(), if delete() fails, check if the file indeed exists. If it is, let's attempt a re-delete() few ms later.
> 3) If (3) still fails, throw an exception. Alternatively, we can attempt a deleteOnExit.
> I'll post a patch later today.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Resolved: (LUCENE-2421) Hardening of NativeFSLock

Posted by "Shai Erera (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shai Erera resolved LUCENE-2421.
--------------------------------

    Resolution: Fixed

Committed revision 940730.

> Hardening of NativeFSLock
> -------------------------
>
>                 Key: LUCENE-2421
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2421
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>             Fix For: 3.1
>
>         Attachments: LUCENE-2421.patch, LUCENE-2421.patch, LUCENE-2421.patch, LUCENE-2421.patch
>
>
> NativeFSLock create a test lock file which its name might collide w/ another JVM that is running. Very unlikely, but still it happened a couple of times already, since the tests were parallelized. This may result in a false exception thrown from release(), when the lock file's delete() is called and returns false, because the file does not exist (deleted by another JVM already). In addition, release() should give a second attempt to delete() if it fails, since the file may be held temporarily by another process (like AntiVirus) before it fails. The proposed changes are:
> 1) Use ManagementFactory.getRuntimeMXBean().getName() as part of the test lock name (should include the process Id)
> 2) In release(), if delete() fails, check if the file indeed exists. If it is, let's attempt a re-delete() few ms later.
> 3) If (3) still fails, throw an exception. Alternatively, we can attempt a deleteOnExit.
> I'll post a patch later today.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Commented: (LUCENE-2421) Hardening of NativeFSLock

Posted by "Shai Erera (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12918020#action_12918020 ] 

Shai Erera commented on LUCENE-2421:
------------------------------------

>From what I remember, it was due to LuceneJUnitResultFormatter, yes.

I don't mind if we stop using j.l.management and move to either:
* Seed the Random w/ System.nanoTime() or
* Don't acquire a test lock.

I don't remember if I ever tried the first approach. Your comment on LUCENE-2688 about not acquiring the test lock makes sense to me, but some testing would be required to confirm that I guess.

> Hardening of NativeFSLock
> -------------------------
>
>                 Key: LUCENE-2421
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2421
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>             Fix For: 2.9.3, 3.0.2, 3.1, 4.0
>
>         Attachments: LUCENE-2421-2.patch, LUCENE-2421.patch, LUCENE-2421.patch, LUCENE-2421.patch, LUCENE-2421.patch
>
>
> NativeFSLock create a test lock file which its name might collide w/ another JVM that is running. Very unlikely, but still it happened a couple of times already, since the tests were parallelized. This may result in a false exception thrown from release(), when the lock file's delete() is called and returns false, because the file does not exist (deleted by another JVM already). In addition, release() should give a second attempt to delete() if it fails, since the file may be held temporarily by another process (like AntiVirus) before it fails. The proposed changes are:
> 1) Use ManagementFactory.getRuntimeMXBean().getName() as part of the test lock name (should include the process Id)
> 2) In release(), if delete() fails, check if the file indeed exists. If it is, let's attempt a re-delete() few ms later.
> 3) If (3) still fails, throw an exception. Alternatively, we can attempt a deleteOnExit.
> I'll post a patch later today.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Updated: (LUCENE-2421) Hardening of NativeFSLock

Posted by "Shai Erera (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shai Erera updated LUCENE-2421:
-------------------------------

    Attachment: LUCENE-2421.patch

Ok, I'm convinced - let's design for today. Removed the "last chance" code.

> Hardening of NativeFSLock
> -------------------------
>
>                 Key: LUCENE-2421
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2421
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>             Fix For: 3.1
>
>         Attachments: LUCENE-2421.patch, LUCENE-2421.patch, LUCENE-2421.patch, LUCENE-2421.patch
>
>
> NativeFSLock create a test lock file which its name might collide w/ another JVM that is running. Very unlikely, but still it happened a couple of times already, since the tests were parallelized. This may result in a false exception thrown from release(), when the lock file's delete() is called and returns false, because the file does not exist (deleted by another JVM already). In addition, release() should give a second attempt to delete() if it fails, since the file may be held temporarily by another process (like AntiVirus) before it fails. The proposed changes are:
> 1) Use ManagementFactory.getRuntimeMXBean().getName() as part of the test lock name (should include the process Id)
> 2) In release(), if delete() fails, check if the file indeed exists. If it is, let's attempt a re-delete() few ms later.
> 3) If (3) still fails, throw an exception. Alternatively, we can attempt a deleteOnExit.
> I'll post a patch later today.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Commented: (LUCENE-2421) Hardening of NativeFSLock

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12862834#action_12862834 ] 

Michael McCandless commented on LUCENE-2421:
--------------------------------------------

bq. Well ... I think they'd appreciate the sleep instead of the immediate exception. Previuosly, I've been asked by customers to do that in other places. A 100ms sleep is barely noticeable (being so rare), but an exception definitely is

Yeah I agree we should try to be robust to the "diverse" installations out there, like AVs that tie up our files.

bq. For the regular lock, we've already agreed we cannot leave it behind as it will prevent future locking

Wait, how come?  (SimpleFSLF does this, but native gets a "real" lock).

bq.  For the test lock I catch the exception in acquireTestLock and call deleteOnExit() suppressing the exception.

OK.

> Hardening of NativeFSLock
> -------------------------
>
>                 Key: LUCENE-2421
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2421
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>             Fix For: 3.1
>
>         Attachments: LUCENE-2421.patch
>
>
> NativeFSLock create a test lock file which its name might collide w/ another JVM that is running. Very unlikely, but still it happened a couple of times already, since the tests were parallelized. This may result in a false exception thrown from release(), when the lock file's delete() is called and returns false, because the file does not exist (deleted by another JVM already). In addition, release() should give a second attempt to delete() if it fails, since the file may be held temporarily by another process (like AntiVirus) before it fails. The proposed changes are:
> 1) Use ManagementFactory.getRuntimeMXBean().getName() as part of the test lock name (should include the process Id)
> 2) In release(), if delete() fails, check if the file indeed exists. If it is, let's attempt a re-delete() few ms later.
> 3) If (3) still fails, throw an exception. Alternatively, we can attempt a deleteOnExit.
> I'll post a patch later today.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Updated: (LUCENE-2421) Hardening of NativeFSLock

Posted by "Shai Erera (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shai Erera updated LUCENE-2421:
-------------------------------

    Fix Version/s: 3.1

> Hardening of NativeFSLock
> -------------------------
>
>                 Key: LUCENE-2421
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2421
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>             Fix For: 3.1, 4.0
>
>         Attachments: LUCENE-2421.patch, LUCENE-2421.patch, LUCENE-2421.patch, LUCENE-2421.patch
>
>
> NativeFSLock create a test lock file which its name might collide w/ another JVM that is running. Very unlikely, but still it happened a couple of times already, since the tests were parallelized. This may result in a false exception thrown from release(), when the lock file's delete() is called and returns false, because the file does not exist (deleted by another JVM already). In addition, release() should give a second attempt to delete() if it fails, since the file may be held temporarily by another process (like AntiVirus) before it fails. The proposed changes are:
> 1) Use ManagementFactory.getRuntimeMXBean().getName() as part of the test lock name (should include the process Id)
> 2) In release(), if delete() fails, check if the file indeed exists. If it is, let's attempt a re-delete() few ms later.
> 3) If (3) still fails, throw an exception. Alternatively, we can attempt a deleteOnExit.
> I'll post a patch later today.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Resolved: (LUCENE-2421) Hardening of NativeFSLock

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless resolved LUCENE-2421.
----------------------------------------

    Resolution: Fixed

> Hardening of NativeFSLock
> -------------------------
>
>                 Key: LUCENE-2421
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2421
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>             Fix For: 2.9.3, 3.0.2, 3.1, 4.0
>
>         Attachments: LUCENE-2421.patch, LUCENE-2421.patch, LUCENE-2421.patch, LUCENE-2421.patch
>
>
> NativeFSLock create a test lock file which its name might collide w/ another JVM that is running. Very unlikely, but still it happened a couple of times already, since the tests were parallelized. This may result in a false exception thrown from release(), when the lock file's delete() is called and returns false, because the file does not exist (deleted by another JVM already). In addition, release() should give a second attempt to delete() if it fails, since the file may be held temporarily by another process (like AntiVirus) before it fails. The proposed changes are:
> 1) Use ManagementFactory.getRuntimeMXBean().getName() as part of the test lock name (should include the process Id)
> 2) In release(), if delete() fails, check if the file indeed exists. If it is, let's attempt a re-delete() few ms later.
> 3) If (3) still fails, throw an exception. Alternatively, we can attempt a deleteOnExit.
> I'll post a patch later today.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Commented: (LUCENE-2421) Hardening of NativeFSLock

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12862994#action_12862994 ] 

Uwe Schindler commented on LUCENE-2421:
---------------------------------------

bq. The file created included the @ character, which is valid on Windows (and also verified on Linux). But I'm still not sure about using it, because if a system returns a character that is not a valid file character, we may not get the same file name we think ... so maybe if we still want to use it, we should move it to the end of the string, so that the name includes the random number for sure ... what do you think?

I suggested on IRC to take the output of getName() and do a regexp replace and replace all non ACSII-digit chars:
{code}
ManagementFactory.getRuntimeMXBean().getName().replaceAll("[^a..zA..Z0..9]","")
{code}
Or alternatively simply use the hashCode of that String (but that is less unique).

> Hardening of NativeFSLock
> -------------------------
>
>                 Key: LUCENE-2421
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2421
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>             Fix For: 3.1
>
>         Attachments: LUCENE-2421.patch, LUCENE-2421.patch
>
>
> NativeFSLock create a test lock file which its name might collide w/ another JVM that is running. Very unlikely, but still it happened a couple of times already, since the tests were parallelized. This may result in a false exception thrown from release(), when the lock file's delete() is called and returns false, because the file does not exist (deleted by another JVM already). In addition, release() should give a second attempt to delete() if it fails, since the file may be held temporarily by another process (like AntiVirus) before it fails. The proposed changes are:
> 1) Use ManagementFactory.getRuntimeMXBean().getName() as part of the test lock name (should include the process Id)
> 2) In release(), if delete() fails, check if the file indeed exists. If it is, let's attempt a re-delete() few ms later.
> 3) If (3) still fails, throw an exception. Alternatively, we can attempt a deleteOnExit.
> I'll post a patch later today.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Reopened: (LUCENE-2421) Hardening of NativeFSLock

Posted by "Shai Erera (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shai Erera reopened LUCENE-2421:
--------------------------------


Reopening to handle clearLock. Patch will be posted soon.

> Hardening of NativeFSLock
> -------------------------
>
>                 Key: LUCENE-2421
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2421
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>             Fix For: 2.9.3, 3.0.2, 3.1, 4.0
>
>         Attachments: LUCENE-2421.patch, LUCENE-2421.patch, LUCENE-2421.patch, LUCENE-2421.patch
>
>
> NativeFSLock create a test lock file which its name might collide w/ another JVM that is running. Very unlikely, but still it happened a couple of times already, since the tests were parallelized. This may result in a false exception thrown from release(), when the lock file's delete() is called and returns false, because the file does not exist (deleted by another JVM already). In addition, release() should give a second attempt to delete() if it fails, since the file may be held temporarily by another process (like AntiVirus) before it fails. The proposed changes are:
> 1) Use ManagementFactory.getRuntimeMXBean().getName() as part of the test lock name (should include the process Id)
> 2) In release(), if delete() fails, check if the file indeed exists. If it is, let's attempt a re-delete() few ms later.
> 3) If (3) still fails, throw an exception. Alternatively, we can attempt a deleteOnExit.
> I'll post a patch later today.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Commented: (LUCENE-2421) Hardening of NativeFSLock

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12862735#action_12862735 ] 

Michael McCandless commented on LUCENE-2421:
--------------------------------------------

I guess since it's a last resort "we think will very rarely happen in practice" thing... we can leave it @ 100 msec.  Just makes me nervous :)  Libraries shouldn't be doing things like sleeping on you (for so long)...

Hmm -- after that sleep, we still throw LockReleaseFailedException if we are still unable to delete it?  Isn't it "harmless" if we can't delete it?

Also, rather than swallowing the InterruptedException, you should immediately throw ThreadInterruptedException -- we fixed Lucene a while back to not swallow this exception (LUCENE-2053).

> Hardening of NativeFSLock
> -------------------------
>
>                 Key: LUCENE-2421
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2421
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>             Fix For: 3.1
>
>         Attachments: LUCENE-2421.patch
>
>
> NativeFSLock create a test lock file which its name might collide w/ another JVM that is running. Very unlikely, but still it happened a couple of times already, since the tests were parallelized. This may result in a false exception thrown from release(), when the lock file's delete() is called and returns false, because the file does not exist (deleted by another JVM already). In addition, release() should give a second attempt to delete() if it fails, since the file may be held temporarily by another process (like AntiVirus) before it fails. The proposed changes are:
> 1) Use ManagementFactory.getRuntimeMXBean().getName() as part of the test lock name (should include the process Id)
> 2) In release(), if delete() fails, check if the file indeed exists. If it is, let's attempt a re-delete() few ms later.
> 3) If (3) still fails, throw an exception. Alternatively, we can attempt a deleteOnExit.
> I'll post a patch later today.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Updated: (LUCENE-2421) Hardening of NativeFSLock

Posted by "Shai Erera (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shai Erera updated LUCENE-2421:
-------------------------------

    Attachment: LUCENE-2421-2.patch

Patch fixes clearLock to first release the lock and then delete it. If the release fails, it means another process is holding the lock and therefore we shouldn't fail. If the delete fails, it ignores it.

I plan to commit shortly.

> Hardening of NativeFSLock
> -------------------------
>
>                 Key: LUCENE-2421
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2421
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>             Fix For: 2.9.3, 3.0.2, 3.1, 4.0
>
>         Attachments: LUCENE-2421-2.patch, LUCENE-2421.patch, LUCENE-2421.patch, LUCENE-2421.patch, LUCENE-2421.patch
>
>
> NativeFSLock create a test lock file which its name might collide w/ another JVM that is running. Very unlikely, but still it happened a couple of times already, since the tests were parallelized. This may result in a false exception thrown from release(), when the lock file's delete() is called and returns false, because the file does not exist (deleted by another JVM already). In addition, release() should give a second attempt to delete() if it fails, since the file may be held temporarily by another process (like AntiVirus) before it fails. The proposed changes are:
> 1) Use ManagementFactory.getRuntimeMXBean().getName() as part of the test lock name (should include the process Id)
> 2) In release(), if delete() fails, check if the file indeed exists. If it is, let's attempt a re-delete() few ms later.
> 3) If (3) still fails, throw an exception. Alternatively, we can attempt a deleteOnExit.
> I'll post a patch later today.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Commented: (LUCENE-2421) Hardening of NativeFSLock

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12863299#action_12863299 ] 

Michael McCandless commented on LUCENE-2421:
--------------------------------------------

Well sleeping for so long isn't necessarily harmless ;)

And it's code to fix a problem that's only theoretical at this point -- we've never seen this actually happen -- so this violates 'design for today'.  Why fix something that may not even be an actual problem...

Also the lock file name has "-test" in it, which is a strong hint to the future hypothetical user that hits this.


> Hardening of NativeFSLock
> -------------------------
>
>                 Key: LUCENE-2421
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2421
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>             Fix For: 3.1
>
>         Attachments: LUCENE-2421.patch, LUCENE-2421.patch, LUCENE-2421.patch
>
>
> NativeFSLock create a test lock file which its name might collide w/ another JVM that is running. Very unlikely, but still it happened a couple of times already, since the tests were parallelized. This may result in a false exception thrown from release(), when the lock file's delete() is called and returns false, because the file does not exist (deleted by another JVM already). In addition, release() should give a second attempt to delete() if it fails, since the file may be held temporarily by another process (like AntiVirus) before it fails. The proposed changes are:
> 1) Use ManagementFactory.getRuntimeMXBean().getName() as part of the test lock name (should include the process Id)
> 2) In release(), if delete() fails, check if the file indeed exists. If it is, let's attempt a re-delete() few ms later.
> 3) If (3) still fails, throw an exception. Alternatively, we can attempt a deleteOnExit.
> I'll post a patch later today.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Resolved: (LUCENE-2421) Hardening of NativeFSLock

Posted by "Shai Erera (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shai Erera resolved LUCENE-2421.
--------------------------------

    Resolution: Fixed

Committed revision 941361 (backport to 3x).

> Hardening of NativeFSLock
> -------------------------
>
>                 Key: LUCENE-2421
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2421
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>             Fix For: 3.1, 4.0
>
>         Attachments: LUCENE-2421.patch, LUCENE-2421.patch, LUCENE-2421.patch, LUCENE-2421.patch
>
>
> NativeFSLock create a test lock file which its name might collide w/ another JVM that is running. Very unlikely, but still it happened a couple of times already, since the tests were parallelized. This may result in a false exception thrown from release(), when the lock file's delete() is called and returns false, because the file does not exist (deleted by another JVM already). In addition, release() should give a second attempt to delete() if it fails, since the file may be held temporarily by another process (like AntiVirus) before it fails. The proposed changes are:
> 1) Use ManagementFactory.getRuntimeMXBean().getName() as part of the test lock name (should include the process Id)
> 2) In release(), if delete() fails, check if the file indeed exists. If it is, let's attempt a re-delete() few ms later.
> 3) If (3) still fails, throw an exception. Alternatively, we can attempt a deleteOnExit.
> I'll post a patch later today.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Commented: (LUCENE-2421) Hardening of NativeFSLock

Posted by "Shai Erera (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12863099#action_12863099 ] 

Shai Erera commented on LUCENE-2421:
------------------------------------

You're right Mark ! I did not do a thorough check on NativeFSLock.obtain(). In the beginning of the method, it calls lockExists() and I assumed it checks for the existence of the file. But it actually checks whether the FileLock is not null. So that means:
* If one obtains a Native lock and later a Simple lock obtain is attempted - it will fail.
* If one obtains a Simple lock and later a Native lock obtain is attempted - it will succeed.

I wrote the following code to demonstrate that:
{code}
SimpleFSLockFactory simple = new SimpleFSLockFactory(dir);
NativeFSLockFactory nativel = new NativeFSLockFactory(dir);
System.out.println(nativel.makeLock("test").obtain());
System.out.println(simple.makeLock("test").obtain());
{code}

This prints "true" and "false" while if you move the simple.makeLock line above the native, it prints "true" twice.

I don't know if that's a problem or not because it all boils down to whether we want to be nice if the user has made a mistake and used two lock factories in two different places of the code.

Given that, if we are ok to declare this is unsupported in the sense that the code won't play nice, then I'm ok w/ not failing if the lock file deletion fails, for both the regular and test lock. I think it makes sense to decide the code doesn't play nice, because someone can anyway extend LF and do such silly mistakes ...

> Hardening of NativeFSLock
> -------------------------
>
>                 Key: LUCENE-2421
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2421
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>             Fix For: 3.1
>
>         Attachments: LUCENE-2421.patch, LUCENE-2421.patch
>
>
> NativeFSLock create a test lock file which its name might collide w/ another JVM that is running. Very unlikely, but still it happened a couple of times already, since the tests were parallelized. This may result in a false exception thrown from release(), when the lock file's delete() is called and returns false, because the file does not exist (deleted by another JVM already). In addition, release() should give a second attempt to delete() if it fails, since the file may be held temporarily by another process (like AntiVirus) before it fails. The proposed changes are:
> 1) Use ManagementFactory.getRuntimeMXBean().getName() as part of the test lock name (should include the process Id)
> 2) In release(), if delete() fails, check if the file indeed exists. If it is, let's attempt a re-delete() few ms later.
> 3) If (3) still fails, throw an exception. Alternatively, we can attempt a deleteOnExit.
> I'll post a patch later today.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Commented: (LUCENE-2421) Hardening of NativeFSLock

Posted by "Mark Miller (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12862835#action_12862835 ] 

Mark Miller commented on LUCENE-2421:
-------------------------------------

bq. Wait, how come? (SimpleFSLF does this, but native gets a "real" lock).

Yeah, this doesn't make sense to me either - native file locks don't care if the file is preexisting or not. It shouldnt matter if the file is already there or needs to be created. Its just a dummy file to get the lock on - even you start Lucene and the file exists, there will be no lock on it, so it won't matter.

> Hardening of NativeFSLock
> -------------------------
>
>                 Key: LUCENE-2421
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2421
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>             Fix For: 3.1
>
>         Attachments: LUCENE-2421.patch
>
>
> NativeFSLock create a test lock file which its name might collide w/ another JVM that is running. Very unlikely, but still it happened a couple of times already, since the tests were parallelized. This may result in a false exception thrown from release(), when the lock file's delete() is called and returns false, because the file does not exist (deleted by another JVM already). In addition, release() should give a second attempt to delete() if it fails, since the file may be held temporarily by another process (like AntiVirus) before it fails. The proposed changes are:
> 1) Use ManagementFactory.getRuntimeMXBean().getName() as part of the test lock name (should include the process Id)
> 2) In release(), if delete() fails, check if the file indeed exists. If it is, let's attempt a re-delete() few ms later.
> 3) If (3) still fails, throw an exception. Alternatively, we can attempt a deleteOnExit.
> I'll post a patch later today.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Commented: (LUCENE-2421) Hardening of NativeFSLock

Posted by "Shai Erera (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12862666#action_12862666 ] 

Shai Erera commented on LUCENE-2421:
------------------------------------

I'm having second thoughts about using MXBean, since the name returned includes invalid filename characters such as @. Will need to check that next time I'm infront of the code, but perhaps w/ all the other changes that might not be necessary ...

> Hardening of NativeFSLock
> -------------------------
>
>                 Key: LUCENE-2421
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2421
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>             Fix For: 3.1
>
>         Attachments: LUCENE-2421.patch
>
>
> NativeFSLock create a test lock file which its name might collide w/ another JVM that is running. Very unlikely, but still it happened a couple of times already, since the tests were parallelized. This may result in a false exception thrown from release(), when the lock file's delete() is called and returns false, because the file does not exist (deleted by another JVM already). In addition, release() should give a second attempt to delete() if it fails, since the file may be held temporarily by another process (like AntiVirus) before it fails. The proposed changes are:
> 1) Use ManagementFactory.getRuntimeMXBean().getName() as part of the test lock name (should include the process Id)
> 2) In release(), if delete() fails, check if the file indeed exists. If it is, let's attempt a re-delete() few ms later.
> 3) If (3) still fails, throw an exception. Alternatively, we can attempt a deleteOnExit.
> I'll post a patch later today.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Reopened: (LUCENE-2421) Hardening of NativeFSLock

Posted by "Shai Erera (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shai Erera reopened LUCENE-2421:
--------------------------------


Back-port to 3.1 as well

> Hardening of NativeFSLock
> -------------------------
>
>                 Key: LUCENE-2421
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2421
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>             Fix For: 3.1, 4.0
>
>         Attachments: LUCENE-2421.patch, LUCENE-2421.patch, LUCENE-2421.patch, LUCENE-2421.patch
>
>
> NativeFSLock create a test lock file which its name might collide w/ another JVM that is running. Very unlikely, but still it happened a couple of times already, since the tests were parallelized. This may result in a false exception thrown from release(), when the lock file's delete() is called and returns false, because the file does not exist (deleted by another JVM already). In addition, release() should give a second attempt to delete() if it fails, since the file may be held temporarily by another process (like AntiVirus) before it fails. The proposed changes are:
> 1) Use ManagementFactory.getRuntimeMXBean().getName() as part of the test lock name (should include the process Id)
> 2) In release(), if delete() fails, check if the file indeed exists. If it is, let's attempt a re-delete() few ms later.
> 3) If (3) still fails, throw an exception. Alternatively, we can attempt a deleteOnExit.
> I'll post a patch later today.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Commented: (LUCENE-2421) Hardening of NativeFSLock

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12863284#action_12863284 ] 

Michael McCandless commented on LUCENE-2421:
--------------------------------------------

Hmm so if failure to delete is no longer a serious issue (ie we are not going to throw an exception), do we really need to sleep for a long time & retry the deletion any more?

This is all based on a hypothetical situation right ("we may fail to delete the test file we just created, and it still exists")?  Maybe we should wait until this is really an issue (ie, a user actually hits it), before implementing this workaround code?

> Hardening of NativeFSLock
> -------------------------
>
>                 Key: LUCENE-2421
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2421
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>             Fix For: 3.1
>
>         Attachments: LUCENE-2421.patch, LUCENE-2421.patch, LUCENE-2421.patch
>
>
> NativeFSLock create a test lock file which its name might collide w/ another JVM that is running. Very unlikely, but still it happened a couple of times already, since the tests were parallelized. This may result in a false exception thrown from release(), when the lock file's delete() is called and returns false, because the file does not exist (deleted by another JVM already). In addition, release() should give a second attempt to delete() if it fails, since the file may be held temporarily by another process (like AntiVirus) before it fails. The proposed changes are:
> 1) Use ManagementFactory.getRuntimeMXBean().getName() as part of the test lock name (should include the process Id)
> 2) In release(), if delete() fails, check if the file indeed exists. If it is, let's attempt a re-delete() few ms later.
> 3) If (3) still fails, throw an exception. Alternatively, we can attempt a deleteOnExit.
> I'll post a patch later today.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Commented: (LUCENE-2421) Hardening of NativeFSLock

Posted by "Shai Erera (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12862973#action_12862973 ] 

Shai Erera commented on LUCENE-2421:
------------------------------------

bq. Lock factories do not have to work with all other lock factories...you need to use the same lock factory across all process'.

I agree, but I've grown accustomed here that we try to protect users from their own mistakes. I.e., if the factory is a parameter to the application, this can certainly happen, w/o any of the application owners even realize that ... so it just seems dangerous to me, and I think that the alternative of keeping the file there and failing elsewhere is not good.

Another scenario - one application opens the index on a local file system (which is shared) and uses Native, while the other, for safety reason, opens the index using Simple lock because it reads the index from a remote share ... does not make a lot of sense - but possible.

bq. or even that the native lock feel needs to be in the same dir (eg it could be in a tmp dir)

Good point - but it's irrelevant to that issue - one can already shoot himself in the leg today by doing that. That's something I don't think we can protect from ... but allowing two lock files in the same directory - that just seems like a bug. Still, one could impl MyOwnNativeFSLock and create "my.lock.file" ... but we cannot protect from that either. So at least, and that's my own opinion, the core factories that are provided w/ Lucene should behave well.

So Mark - I agreed w/ you before on that, until Uwe brought up the reason why NativeFSLock was fixed in 2.9 to use the same lock file as Simple ... it seems like a good catch to me, and I think if it was fixed once already, do we really want to break that fix? That will certainly be a bw break, because now NativeFSLock could obtain a lock if a lock file is there (I'm not sure if we have tests that cover that scenario, but I have a feeling some will fail if we change that).

If the result of that discussion is that we should not fail if the lock file cannot be deleted, then I think we should rename it too, so that Native and Simple use different files and it is clear that you're using two different LFs, which is not supported?

> Hardening of NativeFSLock
> -------------------------
>
>                 Key: LUCENE-2421
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2421
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>             Fix For: 3.1
>
>         Attachments: LUCENE-2421.patch, LUCENE-2421.patch
>
>
> NativeFSLock create a test lock file which its name might collide w/ another JVM that is running. Very unlikely, but still it happened a couple of times already, since the tests were parallelized. This may result in a false exception thrown from release(), when the lock file's delete() is called and returns false, because the file does not exist (deleted by another JVM already). In addition, release() should give a second attempt to delete() if it fails, since the file may be held temporarily by another process (like AntiVirus) before it fails. The proposed changes are:
> 1) Use ManagementFactory.getRuntimeMXBean().getName() as part of the test lock name (should include the process Id)
> 2) In release(), if delete() fails, check if the file indeed exists. If it is, let's attempt a re-delete() few ms later.
> 3) If (3) still fails, throw an exception. Alternatively, we can attempt a deleteOnExit.
> I'll post a patch later today.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Updated: (LUCENE-2421) Hardening of NativeFSLock

Posted by "Shai Erera (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shai Erera updated LUCENE-2421:
-------------------------------

    Attachment: LUCENE-2421.patch

Patch w/ the proposed changes and a fallback to deleteOnExit in acquireTestLock, if deleting the test lock fails still.

BTW, I've used 100ms for the re-delete attempt delay. Do you think that's enough, or should I use a lower value, or perhaps IndexWriterConfig.getDefaultWriteLockTimeout (which is 1s unless changed)?

> Hardening of NativeFSLock
> -------------------------
>
>                 Key: LUCENE-2421
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2421
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>             Fix For: 3.1
>
>         Attachments: LUCENE-2421.patch
>
>
> NativeFSLock create a test lock file which its name might collide w/ another JVM that is running. Very unlikely, but still it happened a couple of times already, since the tests were parallelized. This may result in a false exception thrown from release(), when the lock file's delete() is called and returns false, because the file does not exist (deleted by another JVM already). In addition, release() should give a second attempt to delete() if it fails, since the file may be held temporarily by another process (like AntiVirus) before it fails. The proposed changes are:
> 1) Use ManagementFactory.getRuntimeMXBean().getName() as part of the test lock name (should include the process Id)
> 2) In release(), if delete() fails, check if the file indeed exists. If it is, let's attempt a re-delete() few ms later.
> 3) If (3) still fails, throw an exception. Alternatively, we can attempt a deleteOnExit.
> I'll post a patch later today.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org