Posted to dev@lucene.apache.org by "Uwe Schindler (JIRA)" <ji...@apache.org> on 2015/09/04 21:27:45 UTC

[jira] [Comment Edited] (LUCENE-6770) FSDirectory ctor should use getAbsolutePath instead of getRealPath for directory

    [ https://issues.apache.org/jira/browse/LUCENE-6770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14731290#comment-14731290 ] 

Uwe Schindler edited comment on LUCENE-6770 at 9/4/15 7:27 PM:
---------------------------------------------------------------

The reason we use the canonical path is the following: NativeFSLockFactory has the limitation that the underlying OS does not lock files against the same process; it only prevents other processes from using the locked file/directory. So NativeFSLockFactory internally keeps a static set of "locked index directories", and when acquiring a lock it first checks whether the given directory is in the set. If it is, it refuses to acquire the lock. Otherwise it falls back to the OS kernel to check the lock.
For this check against a simple set to work correctly, the path must be canonical. Otherwise a user could open the same index in one JVM through 2 different Path objects (which point to the same dir via symlinks/hardlinks/junctions), causing index corruption.
As getting the canonical path is quite expensive, we don't expand it on every attempt to lock (which could also break if people change links while the index is open). So we do it on FSDirectory init.
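A minimal sketch of the in-JVM check described above, assuming a hypothetical class InProcessLockRegistry (this is not the NativeFSLockFactory source). The path is canonicalized once by the caller, and the static set then rejects a second lock on the same directory even when it is reached through a different Path spelling:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Path;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical in-JVM lock registry; illustrative only, not Lucene code. */
final class InProcessLockRegistry {
    // Shared by the whole JVM and keyed by canonical path.
    private static final Set<Path> LOCKED =
            Collections.synchronizedSet(new HashSet<>());

    private InProcessLockRegistry() {}

    /** Resolves symlinks and "." / ".." entries; the directory must exist. */
    static Path canonicalize(Path dir) {
        try {
            return dir.toRealPath();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /**
     * Returns true if the directory was not yet locked in this JVM.
     * A real implementation would then go on to take the OS-level lock.
     */
    static boolean tryLock(Path canonicalDir) {
        return LOCKED.add(canonicalDir);
    }

    /** Releases the in-JVM entry for the directory. */
    static void unlock(Path canonicalDir) {
        LOCKED.remove(canonicalDir);
    }
}
```

Calling canonicalize once at construction time and reusing the result for every tryLock call mirrors the "do it on init" choice: the expensive resolution is not repeated, and later changes to links do not alter the key.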
To work around the issue mentioned here, one possibility would be to save the original Path as given in the ctor and return that one from getDirectory(). The canonical path would then be an implementation detail.
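The workaround could look roughly like the following sketch. The class DirectoryPaths and the method lockKey() are hypothetical names chosen for illustration, not part of FSDirectory; only getDirectory() is taken from the discussion:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Path;

/** Hypothetical holder for both path forms; illustrative only. */
final class DirectoryPaths {
    private final Path original;   // exactly what the caller passed in
    private final Path canonical;  // resolved once, used only for locking

    DirectoryPaths(Path path) {
        this.original = path;
        this.canonical = resolve(path);
    }

    // Resolves symlinks once at construction; the directory must exist.
    private static Path resolve(Path path) {
        try {
            return path.toRealPath();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Public view: the path as given by the caller. */
    Path getDirectory() {
        return original;
    }

    /** Internal view: the canonical path used as the lock key. */
    Path lockKey() {
        return canonical;
    }
}
```

With this split, a test that compares getDirectory() against the path it passed in would see the same value, while the lock check still sees one key per physical directory.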


was (Author: thetaphi):
The reason we use the canonical path is the following: NativeFSLockFactory has the limitation that the underlying OS does not lock files against the same process; it only prevents other processes from using the locked file/directory. So NativeFSLockFactory internally keeps a static set of "locked index directories", and when acquiring a lock it first checks whether the given directory is in the set. If it is, it refuses to acquire the lock. Otherwise it falls back to the OS kernel to check the lock.
For this check against a simple set to work correctly, the path must be canonical. Otherwise a user could open the same index in one JVM through 2 different Path objects (which point to the same dir via symlinks/hardlinks/junctions), leading to broken indexes.
As getting the canonical path is quite expensive, we don't expand it on every attempt to lock (which could also break if people change links while the index is open). So we do it on FSDirectory init.
To work around the issue mentioned here, one possibility would be to save the original Path as given in the ctor and return that one from getDirectory(). The canonical path would then be an implementation detail.

> FSDirectory ctor should use getAbsolutePath instead of getRealPath for directory
> --------------------------------------------------------------------------------
>
>                 Key: LUCENE-6770
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6770
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/store
>    Affects Versions: 5.2.1
>         Environment: OS X, Linux
>            Reporter: Vladimir Kuzmin
>         Attachments: LUCENE-6770.patch
>
>
> After upgrading from 4.1 to 5.2.1 I found that one of our tests failed. The culprit turned out to be FSDirectory, which converts the given Path with Path.toRealPath(). As a result, this test fails:
> Path p = Paths.get("/var/lucene_store");
> FSDirectory d = FSDirectory.open(p);
> assertEquals(p.toString(), d.getDirectory().toString());
> This is because /var/lucene_store is a symlink and
> Path directory = path.toRealPath();
> resolves it to /private/var/lucene_store
> I think this is a bad design decision because "directory" isn't just internal state but is exposed in a public interface, and "getDirectory()" is widely used to initialize other components.
> It should use path.toAbsolutePath() instead.
> The build and "ant test" were successful after the fix.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org