You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@lucene.apache.org by "Michael McCandless (JIRA)" <ji...@apache.org> on 2010/12/03 16:55:10 UTC

[jira] Created: (LUCENE-2795) Genericize DirectIOLinuxDir -> UnixDir

Genericize DirectIOLinuxDir -> UnixDir
--------------------------------------

                 Key: LUCENE-2795
                 URL: https://issues.apache.org/jira/browse/LUCENE-2795
             Project: Lucene - Java
          Issue Type: Improvement
          Components: Store
            Reporter: Michael McCandless


Today DirectIOLinuxDir is tricky/dangerous to use, because you only want to use it for indexWriter and not IndexReader (searching).  It's a trap.

But, once we do LUCENE-2793, we can make it fully general purpose because then a single native Dir impl can be used.

I'd also like to make it generic to other Unices, if we can, so that it becomes UnixDirectory.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (LUCENE-2795) Genericize DirectIOLinuxDir -> UnixDir

Posted by "Varun Thacker (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13060331#comment-13060331 ] 

Varun Thacker commented on LUCENE-2795:
---------------------------------------

{quote}
Hey Varun - saw you asked for someone with a mac to run some code for you in IRC but you popped off before I saw - what do you need? Just apply the patch and run the tests?
{quote} 

This patch will apply to the LUCENE2793 branch. Othewise in file : lucene/contrib/misc/src/java/org/apache/lucene/store/NativePosixUtil.cpp after line 117 inside the "if" add this line - {code} fcntl(fd, F_NOCACHE, 1); {code}

And then by running {code}ant build-native-unix{code} from the /contrib/misc folder to check if it compiles successfully. 

Thanks.

> Genericize DirectIOLinuxDir -> UnixDir
> --------------------------------------
>
>                 Key: LUCENE-2795
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2795
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/store
>            Reporter: Michael McCandless
>            Assignee: Varun Thacker
>              Labels: gsoc2011, lucene-gsoc-11, mentor
>         Attachments: LUCENE-2795.patch
>
>
> Today DirectIOLinuxDir is tricky/dangerous to use, because you only want to use it for indexWriter and not IndexReader (searching).  It's a trap.
> But, once we do LUCENE-2793, we can make it fully general purpose because then a single native Dir impl can be used.
> I'd also like to make it generic to other Unices, if we can, so that it becomes UnixDirectory.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Updated] (LUCENE-2795) Genericize DirectIOLinuxDir -> UnixDir

Posted by "Varun Thacker (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Varun Thacker updated LUCENE-2795:
----------------------------------

    Attachment: LUCENE-2795.patch

I'm not sure this is the correct/best way to this. 

> Genericize DirectIOLinuxDir -> UnixDir
> --------------------------------------
>
>                 Key: LUCENE-2795
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2795
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/store
>            Reporter: Michael McCandless
>            Assignee: Varun Thacker
>              Labels: gsoc2011, lucene-gsoc-11, mentor
>         Attachments: LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch
>
>
> Today DirectIOLinuxDir is tricky/dangerous to use, because you only want to use it for indexWriter and not IndexReader (searching).  It's a trap.
> But, once we do LUCENE-2793, we can make it fully general purpose because then a single native Dir impl can be used.
> I'd also like to make it generic to other Unices, if we can, so that it becomes UnixDirectory.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Assigned] (LUCENE-2795) Genericize DirectIOLinuxDir -> UnixDir

Posted by "Simon Willnauer (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simon Willnauer reassigned LUCENE-2795:
---------------------------------------

    Assignee: Simon Willnauer

> Genericize DirectIOLinuxDir -> UnixDir
> --------------------------------------
>
>                 Key: LUCENE-2795
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2795
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Store
>            Reporter: Michael McCandless
>            Assignee: Simon Willnauer
>              Labels: gsoc2011, lucene-gsoc-11, mentor
>
> Today DirectIOLinuxDir is tricky/dangerous to use, because you only want to use it for indexWriter and not IndexReader (searching).  It's a trap.
> But, once we do LUCENE-2793, we can make it fully general purpose because then a single native Dir impl can be used.
> I'd also like to make it generic to other Unices, if we can, so that it becomes UnixDirectory.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (LUCENE-2795) Genericize DirectIOLinuxDir -> UnixDir

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13067105#comment-13067105 ] 

Michael McCandless commented on LUCENE-2795:
--------------------------------------------

Also, we should run some more tests here, on modern kernels, since the SEQUENTIAL fadvise flag is better implemented in kernels >= 2.6.29; see this comment:

http://blog.mikemccandless.com/2010/06/lucene-and-fadvisemadvise.html?showComment=1303235497682#c2572106601600642254

It could be, if kernel is new enough, we should not do DIRECT and instead do SEQUENTIAL.

> Genericize DirectIOLinuxDir -> UnixDir
> --------------------------------------
>
>                 Key: LUCENE-2795
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2795
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/store
>            Reporter: Michael McCandless
>            Assignee: Varun Thacker
>              Labels: gsoc2011, lucene-gsoc-11, mentor
>         Attachments: LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch
>
>
> Today DirectIOLinuxDir is tricky/dangerous to use, because you only want to use it for indexWriter and not IndexReader (searching).  It's a trap.
> But, once we do LUCENE-2793, we can make it fully general purpose because then a single native Dir impl can be used.
> I'd also like to make it generic to other Unices, if we can, so that it becomes UnixDirectory.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Updated] (LUCENE-2795) Genericize DirectIOLinuxDir -> UnixDir

Posted by "Varun Thacker (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Varun Thacker updated LUCENE-2795:
----------------------------------

    Attachment: LUCENE-2795.patch

Initial Patch..

This will apply to the IOContext branch. This patch also contains some uncommitted code from LUCENE-2793. 

In NativePosixUtil.cpp the open_direct method where I use fcntl() for OS X could not be compiled on my machine because of the OS specific F_NOCACHE flag used. Let me know if the usage is correct. 

> Genericize DirectIOLinuxDir -> UnixDir
> --------------------------------------
>
>                 Key: LUCENE-2795
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2795
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/store
>            Reporter: Michael McCandless
>            Assignee: Simon Willnauer
>              Labels: gsoc2011, lucene-gsoc-11, mentor
>         Attachments: LUCENE-2795.patch
>
>
> Today DirectIOLinuxDir is tricky/dangerous to use, because you only want to use it for indexWriter and not IndexReader (searching).  It's a trap.
> But, once we do LUCENE-2793, we can make it fully general purpose because then a single native Dir impl can be used.
> I'd also like to make it generic to other Unices, if we can, so that it becomes UnixDirectory.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Assigned] (LUCENE-2795) Genericize DirectIOLinuxDir -> UnixDir

Posted by "Varun Thacker (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Varun Thacker reassigned LUCENE-2795:
-------------------------------------

    Assignee: Varun Thacker  (was: Simon Willnauer)

> Genericize DirectIOLinuxDir -> UnixDir
> --------------------------------------
>
>                 Key: LUCENE-2795
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2795
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/store
>            Reporter: Michael McCandless
>            Assignee: Varun Thacker
>              Labels: gsoc2011, lucene-gsoc-11, mentor
>         Attachments: LUCENE-2795.patch
>
>
> Today DirectIOLinuxDir is tricky/dangerous to use, because you only want to use it for indexWriter and not IndexReader (searching).  It's a trap.
> But, once we do LUCENE-2793, we can make it fully general purpose because then a single native Dir impl can be used.
> I'd also like to make it generic to other Unices, if we can, so that it becomes UnixDirectory.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Updated] (LUCENE-2795) Genericize DirectIOLinuxDir -> UnixDir

Posted by "Varun Thacker (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Varun Thacker updated LUCENE-2795:
----------------------------------

    Attachment: LUCENE-2795.patch

I updated the patch so that it applies cleanly with the current trunk revision.

{noformat}ant build-native-unix{noformat}  Comiles successfully on a 64 bit Ubuntu Linux machine running kernel version 3.0.0-14-generic. 

{noformat}ant-DDirectory=NativeUnixDirectory test-core{noformat} 
Runs successfully.

Is more testing required before committing it?
                
> Genericize DirectIOLinuxDir -> UnixDir
> --------------------------------------
>
>                 Key: LUCENE-2795
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2795
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/store
>            Reporter: Michael McCandless
>            Assignee: Varun Thacker
>              Labels: gsoc2011, lucene-gsoc-11, mentor
>         Attachments: LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch
>
>
> Today DirectIOLinuxDir is tricky/dangerous to use, because you only want to use it for indexWriter and not IndexReader (searching).  It's a trap.
> But, once we do LUCENE-2793, we can make it fully general purpose because then a single native Dir impl can be used.
> I'd also like to make it generic to other Unices, if we can, so that it becomes UnixDirectory.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (LUCENE-2795) Genericize DirectIOLinuxDir -> UnixDir

Posted by "Michael McCandless (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13187045#comment-13187045 ] 

Michael McCandless commented on LUCENE-2795:
--------------------------------------------

OK tests pass (slowly: 46 minutes) on OS X as well.

I think this is ready!
                
> Genericize DirectIOLinuxDir -> UnixDir
> --------------------------------------
>
>                 Key: LUCENE-2795
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2795
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/store
>            Reporter: Michael McCandless
>            Assignee: Varun Thacker
>              Labels: gsoc2011, lucene-gsoc-11, mentor
>         Attachments: LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch
>
>
> Today DirectIOLinuxDir is tricky/dangerous to use, because you only want to use it for indexWriter and not IndexReader (searching).  It's a trap.
> But, once we do LUCENE-2793, we can make it fully general purpose because then a single native Dir impl can be used.
> I'd also like to make it generic to other Unices, if we can, so that it becomes UnixDirectory.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (LUCENE-2795) Genericize DirectIOLinuxDir -> UnixDir

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13024631#comment-13024631 ] 

Michael McCandless commented on LUCENE-2795:
--------------------------------------------

Today, we don't use any of these flags for searching/merging (that's what this issue and LUCENE-2793 will fix).

Unfortunately, it looks like the best IO flags are going to vary depending on kernel version.  The above patch was applied in kernel 2.6.29, and causes SEQUENTIAL to act like NOREUSE, and so we should use SEQUENTIAL for merging but not for searching, I think?  We should experiment and see...

Really, for searching we want the "old" SEQUENTIAL behavior (ie, do aggressive readahead, but use the normal eviction policy since that term may be searched on again).

Hopefully NativeUnixDir will take care of all these os/kernel-version dependent flags for you.

> Genericize DirectIOLinuxDir -> UnixDir
> --------------------------------------
>
>                 Key: LUCENE-2795
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2795
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Store
>            Reporter: Michael McCandless
>            Assignee: Simon Willnauer
>              Labels: gsoc2011, lucene-gsoc-11, mentor
>
> Today DirectIOLinuxDir is tricky/dangerous to use, because you only want to use it for indexWriter and not IndexReader (searching).  It's a trap.
> But, once we do LUCENE-2793, we can make it fully general purpose because then a single native Dir impl can be used.
> I'd also like to make it generic to other Unices, if we can, so that it becomes UnixDirectory.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Updated: (LUCENE-2795) Genericize DirectIOLinuxDir -> UnixDir

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-2795:
---------------------------------------

    Labels: gsoc2011 lucene-gsoc-11  (was: )

> Genericize DirectIOLinuxDir -> UnixDir
> --------------------------------------
>
>                 Key: LUCENE-2795
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2795
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Store
>            Reporter: Michael McCandless
>              Labels: gsoc2011, lucene-gsoc-11
>
> Today DirectIOLinuxDir is tricky/dangerous to use, because you only want to use it for indexWriter and not IndexReader (searching).  It's a trap.
> But, once we do LUCENE-2793, we can make it fully general purpose because then a single native Dir impl can be used.
> I'd also like to make it generic to other Unices, if we can, so that it becomes UnixDirectory.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Updated: (LUCENE-2795) Genericize DirectIOLinuxDir -> UnixDir

Posted by "Simon Willnauer (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simon Willnauer updated LUCENE-2795:
------------------------------------

    Labels: gsoc2011, lucene-gsoc-11 mentor,  (was: mentor)

> Genericize DirectIOLinuxDir -> UnixDir
> --------------------------------------
>
>                 Key: LUCENE-2795
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2795
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Store
>            Reporter: Michael McCandless
>              Labels: gsoc2011,, lucene-gsoc-11, mentor,
>
> Today DirectIOLinuxDir is tricky/dangerous to use, because you only want to use it for indexWriter and not IndexReader (searching).  It's a trap.
> But, once we do LUCENE-2793, we can make it fully general purpose because then a single native Dir impl can be used.
> I'd also like to make it generic to other Unices, if we can, so that it becomes UnixDirectory.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Updated: (LUCENE-2795) Genericize DirectIOLinuxDir -> UnixDir

Posted by "Simon Willnauer (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simon Willnauer updated LUCENE-2795:
------------------------------------

    Labels: mentor  (was: gsoc2011 lucene-gsoc-11)

> Genericize DirectIOLinuxDir -> UnixDir
> --------------------------------------
>
>                 Key: LUCENE-2795
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2795
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Store
>            Reporter: Michael McCandless
>              Labels: mentor
>
> Today DirectIOLinuxDir is tricky/dangerous to use, because you only want to use it for indexWriter and not IndexReader (searching).  It's a trap.
> But, once we do LUCENE-2793, we can make it fully general purpose because then a single native Dir impl can be used.
> I'd also like to make it generic to other Unices, if we can, so that it becomes UnixDirectory.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Commented: (LUCENE-2795) Genericize DirectIOLinuxDir -> UnixDir

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12979719#action_12979719 ] 

Michael McCandless commented on LUCENE-2795:
--------------------------------------------

https://issues.apache.org/jira/browse/LUCENE-2500?focusedCommentId=12979588&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12979588 has details on what flags to pass to OS X to bypass its buffer cache...

> Genericize DirectIOLinuxDir -> UnixDir
> --------------------------------------
>
>                 Key: LUCENE-2795
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2795
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Store
>            Reporter: Michael McCandless
>
> Today DirectIOLinuxDir is tricky/dangerous to use, because you only want to use it for indexWriter and not IndexReader (searching).  It's a trap.
> But, once we do LUCENE-2793, we can make it fully general purpose because then a single native Dir impl can be used.
> I'd also like to make it generic to other Unices, if we can, so that it becomes UnixDirectory.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (LUCENE-2795) Genericize DirectIOLinuxDir -> UnixDir

Posted by "Varun Thacker (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13023642#comment-13023642 ] 

Varun Thacker commented on LUCENE-2795:
---------------------------------------

This is great for merging . Does IndexReader use the SEQUENTIAL flag for accessing the index ?

> Genericize DirectIOLinuxDir -> UnixDir
> --------------------------------------
>
>                 Key: LUCENE-2795
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2795
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Store
>            Reporter: Michael McCandless
>            Assignee: Simon Willnauer
>              Labels: gsoc2011, lucene-gsoc-11, mentor
>
> Today DirectIOLinuxDir is tricky/dangerous to use, because you only want to use it for indexWriter and not IndexReader (searching).  It's a trap.
> But, once we do LUCENE-2793, we can make it fully general purpose because then a single native Dir impl can be used.
> I'd also like to make it generic to other Unices, if we can, so that it becomes UnixDirectory.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Resolved] (LUCENE-2795) Genericize DirectIOLinuxDir -> UnixDir

Posted by "Michael McCandless (Resolved) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless resolved LUCENE-2795.
----------------------------------------

       Resolution: Fixed
    Fix Version/s: 4.0

Thanks Varun!
                
> Genericize DirectIOLinuxDir -> UnixDir
> --------------------------------------
>
>                 Key: LUCENE-2795
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2795
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/store
>            Reporter: Michael McCandless
>            Assignee: Varun Thacker
>              Labels: gsoc2011, lucene-gsoc-11, mentor
>             Fix For: 4.0
>
>         Attachments: LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch
>
>
> Today DirectIOLinuxDir is tricky/dangerous to use, because you only want to use it for indexWriter and not IndexReader (searching).  It's a trap.
> But, once we do LUCENE-2793, we can make it fully general purpose because then a single native Dir impl can be used.
> I'd also like to make it generic to other Unices, if we can, so that it becomes UnixDirectory.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (LUCENE-2795) Genericize DirectIOLinuxDir -> UnixDir

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13067146#comment-13067146 ] 

Michael McCandless commented on LUCENE-2795:
--------------------------------------------

OK, passes for me on OS X 10.6.6.  I first had to install the Java
developer package (jni.h was missing) from http://connect.apple.com


> Genericize DirectIOLinuxDir -> UnixDir
> --------------------------------------
>
>                 Key: LUCENE-2795
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2795
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/store
>            Reporter: Michael McCandless
>            Assignee: Varun Thacker
>              Labels: gsoc2011, lucene-gsoc-11, mentor
>         Attachments: LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch
>
>
> Today DirectIOLinuxDir is tricky/dangerous to use, because you only want to use it for indexWriter and not IndexReader (searching).  It's a trap.
> But, once we do LUCENE-2793, we can make it fully general purpose because then a single native Dir impl can be used.
> I'd also like to make it generic to other Unices, if we can, so that it becomes UnixDirectory.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Updated] (LUCENE-2795) Genericize DirectIOLinuxDir -> UnixDir

Posted by "Varun Thacker (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Varun Thacker updated LUCENE-2795:
----------------------------------

    Attachment: LUCENE-2795.patch

Updated patch.

http://ant.apache.org/manual/Tasks/conditions.html#os

With this I can get information on the kernel. But the problem is "Linux" is not a separate family attribute. Any pointers on how to go about it so that we can
bq. Also, we should run some more tests here, on modern kernels, since the SEQUENTIAL fadvise flag is better implemented in kernels >= 2.6.29; see this comment:

bq. http://blog.mikemccandless.com/2010/06/lucene-and-fadvisemadvise.html?showComment=0000000000000#c2572106601600642254

bq. It could be, if kernel is new enough, we should not do DIRECT and instead do SEQUENTIAL.

> Genericize DirectIOLinuxDir -> UnixDir
> --------------------------------------
>
>                 Key: LUCENE-2795
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2795
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/store
>            Reporter: Michael McCandless
>            Assignee: Varun Thacker
>              Labels: gsoc2011, lucene-gsoc-11, mentor
>         Attachments: LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch
>
>
> Today DirectIOLinuxDir is tricky/dangerous to use, because you only want to use it for indexWriter and not IndexReader (searching).  It's a trap.
> But, once we do LUCENE-2793, we can make it fully general purpose because then a single native Dir impl can be used.
> I'd also like to make it generic to other Unices, if we can, so that it becomes UnixDirectory.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (LUCENE-2795) Genericize DirectIOLinuxDir -> UnixDir

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13067097#comment-13067097 ] 

Michael McCandless commented on LUCENE-2795:
--------------------------------------------

Can you touch up the javadocs?  Ie it's more general now (not just
linux), and "uses direct io" instead of "Linux's O_DIRECT flag", and
uses direct io for all merge input & output.  Make it clear that this
is a general purpose dir impl...

NativeUnixndexOutput should be NativeUnixIndexOutput.

The createOutput function should also switch between normal vs direct,
depending on MERGE or READ?

I think we should have a default forced buffer size?  The thing to
keep in mind is how many open inputs/outputs we will have during
merging, eg I think something like 5 or 8 files per merged segment
plus another 5 or 8 for the resulting merged segment, so up to 88 file
handles open.  Maybe 256 KB default?

We shouldn't by default use the BII's buffer size: it's way too small
for direct IO.  And then we should always use the forced buffer size
in this dir...

Tests pass for me on Fedora 13 linux!  Next I'll try OS X.

> Genericize DirectIOLinuxDir -> UnixDir
> --------------------------------------
>
>                 Key: LUCENE-2795
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2795
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/store
>            Reporter: Michael McCandless
>            Assignee: Varun Thacker
>              Labels: gsoc2011, lucene-gsoc-11, mentor
>         Attachments: LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch
>
>
> Today DirectIOLinuxDir is tricky/dangerous to use, because you only want to use it for indexWriter and not IndexReader (searching).  It's a trap.
> But, once we do LUCENE-2793, we can make it fully general purpose because then a single native Dir impl can be used.
> I'd also like to make it generic to other Unices, if we can, so that it becomes UnixDirectory.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Updated] (LUCENE-2795) Genericize DirectIOLinuxDir -> UnixDir

Posted by "Simon Willnauer (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simon Willnauer updated LUCENE-2795:
------------------------------------

    Attachment: LUCENE-2795.patch

here is an updated patch that works on OSX. I had to copy the old DirectIOLinuxDirectory.java to NativeUnixDirectory.java (svn copy) to apply the patch cleanly.

I added an additional path to the JDKs header files and a linker argument otherwise I got a linker error. GCC by default doesn't pass -lstdc++ to the linker


> Genericize DirectIOLinuxDir -> UnixDir
> --------------------------------------
>
>                 Key: LUCENE-2795
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2795
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/store
>            Reporter: Michael McCandless
>            Assignee: Varun Thacker
>              Labels: gsoc2011, lucene-gsoc-11, mentor
>         Attachments: LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch
>
>
> Today DirectIOLinuxDir is tricky/dangerous to use, because you only want to use it for indexWriter and not IndexReader (searching).  It's a trap.
> But, once we do LUCENE-2793, we can make it fully general purpose because then a single native Dir impl can be used.
> I'd also like to make it generic to other Unices, if we can, so that it becomes UnixDirectory.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (LUCENE-2795) Genericize DirectIOLinuxDir -> UnixDir

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068636#comment-13068636 ] 

Michael McCandless commented on LUCENE-2795:
--------------------------------------------

I think NativeUnixDir should have a ctor that takes no buffer size, and uses the default?  And merging should always use the forcedBufferSize (let's rename it to mergeBufferSize?), ie, never delegate to BII.bufferSize(context).

Then, NativeUnixIndexInput/Output shouldn't have 2 ctors, only 1 (that takes bufferSize) and NativeUnixDir just passes the buffer size?

Somehow, but this should be a different issue, this Dir impl should only provide the II/IO impls for use during merging; if it's not a merge it should delegate to another Dir impl.  Maybe we pass a delegate to the ctor or something...

> Genericize DirectIOLinuxDir -> UnixDir
> --------------------------------------
>
>                 Key: LUCENE-2795
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2795
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/store
>            Reporter: Michael McCandless
>            Assignee: Varun Thacker
>              Labels: gsoc2011, lucene-gsoc-11, mentor
>         Attachments: LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch
>
>
> Today DirectIOLinuxDir is tricky/dangerous to use, because you only want to use it for indexWriter and not IndexReader (searching).  It's a trap.
> But, once we do LUCENE-2793, we can make it fully general purpose because then a single native Dir impl can be used.
> I'd also like to make it generic to other Unices, if we can, so that it becomes UnixDirectory.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (LUCENE-2795) Genericize DirectIOLinuxDir -> UnixDir

Posted by "Mark Miller (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13060216#comment-13060216 ] 

Mark Miller commented on LUCENE-2795:
-------------------------------------

Hey Varun - saw you asked for someone with a mac to run some code for you in IRC but you popped off before I saw - what do you need? Just apply the patch and run the tests?

> Genericize DirectIOLinuxDir -> UnixDir
> --------------------------------------
>
>                 Key: LUCENE-2795
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2795
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/store
>            Reporter: Michael McCandless
>            Assignee: Varun Thacker
>              Labels: gsoc2011, lucene-gsoc-11, mentor
>         Attachments: LUCENE-2795.patch
>
>
> Today DirectIOLinuxDir is tricky/dangerous to use, because you only want to use it for indexWriter and not IndexReader (searching).  It's a trap.
> But, once we do LUCENE-2793, we can make it fully general purpose because then a single native Dir impl can be used.
> I'd also like to make it generic to other Unices, if we can, so that it becomes UnixDirectory.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Updated: (LUCENE-2795) Genericize DirectIOLinuxDir -> UnixDir

Posted by "Simon Willnauer (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simon Willnauer updated LUCENE-2795:
------------------------------------

    Labels: gsoc2011 lucene-gsoc-11 mentor  (was: gsoc2011, lucene-gsoc-11 mentor,)

> Genericize DirectIOLinuxDir -> UnixDir
> --------------------------------------
>
>                 Key: LUCENE-2795
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2795
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Store
>            Reporter: Michael McCandless
>              Labels: gsoc2011, lucene-gsoc-11, mentor
>
> Today DirectIOLinuxDir is tricky/dangerous to use, because you only want to use it for indexWriter and not IndexReader (searching).  It's a trap.
> But, once we do LUCENE-2793, we can make it fully general purpose because then a single native Dir impl can be used.
> I'd also like to make it generic to other Unices, if we can, so that it becomes UnixDirectory.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (LUCENE-2795) Genericize DirectIOLinuxDir -> UnixDir

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13023614#comment-13023614 ] 

Michael McCandless commented on LUCENE-2795:
--------------------------------------------

It looks like recent Linux kernels have better behavior with SEQUENTIAL flag: http://blog.mikemccandless.com/2010/06/lucene-and-fadvisemadvise.html?showComment=1303235497682#c2572106601600642254

If possible we should test on kernels after that patch was merged to see if passing SEQUENTIAL for merging prevents eviction of hot pages being used for searching...

> Genericize DirectIOLinuxDir -> UnixDir
> --------------------------------------
>
>                 Key: LUCENE-2795
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2795
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Store
>            Reporter: Michael McCandless
>            Assignee: Simon Willnauer
>              Labels: gsoc2011, lucene-gsoc-11, mentor
>
> Today DirectIOLinuxDir is tricky/dangerous to use, because you only want to use it for indexWriter and not IndexReader (searching).  It's a trap.
> But, once we do LUCENE-2793, we can make it fully general purpose because then a single native Dir impl can be used.
> I'd also like to make it generic to other Unices, if we can, so that it becomes UnixDirectory.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Updated] (LUCENE-2795) Genericize DirectIOLinuxDir -> UnixDir

Posted by "Varun Thacker (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Varun Thacker updated LUCENE-2795:
----------------------------------

    Attachment: LUCENE-2795.patch

Changed DirectIOLinuxDirectory to NativeUnixDirectory and also made some other changes.

> Genericize DirectIOLinuxDir -> UnixDir
> --------------------------------------
>
>                 Key: LUCENE-2795
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2795
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/store
>            Reporter: Michael McCandless
>            Assignee: Varun Thacker
>              Labels: gsoc2011, lucene-gsoc-11, mentor
>         Attachments: LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch
>
>
> Today DirectIOLinuxDir is tricky/dangerous to use, because you only want to use it for indexWriter and not IndexReader (searching).  It's a trap.
> But, once we do LUCENE-2793, we can make it fully general purpose because then a single native Dir impl can be used.
> I'd also like to make it generic to other Unices, if we can, so that it becomes UnixDirectory.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (LUCENE-2795) Genericize DirectIOLinuxDir -> UnixDir

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13066619#comment-13066619 ] 

Michael McCandless commented on LUCENE-2795:
--------------------------------------------

OK, so now we create a FileDescriptor obj in C and return that, by
setting the fd field directly?  This is because fadvise/posix_fadvise
on an already opened fd isn't portable enough?  Ie we must open the
file w/ the flags, instead, for some OS's?

I don't think you need posix_fadvise.h at all?  Can't you just
#include the right headers per OS, inside each OS's ifdef?

Missing some { } around single-statement if bodies.  Can you move the
"return NULL" in "if (class_ioex == NULL) return NULL;" to the next
line, with { }?


> Genericize DirectIOLinuxDir -> UnixDir
> --------------------------------------
>
>                 Key: LUCENE-2795
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2795
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/store
>            Reporter: Michael McCandless
>            Assignee: Varun Thacker
>              Labels: gsoc2011, lucene-gsoc-11, mentor
>         Attachments: LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch
>
>
> Today DirectIOLinuxDir is tricky/dangerous to use, because you only want to use it for indexWriter and not IndexReader (searching).  It's a trap.
> But, once we do LUCENE-2793, we can make it fully general purpose because then a single native Dir impl can be used.
> I'd also like to make it generic to other Unices, if we can, so that it becomes UnixDirectory.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (LUCENE-2795) Genericize DirectIOLinuxDir -> UnixDir

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13070644#comment-13070644 ] 

Michael McCandless commented on LUCENE-2795:
--------------------------------------------

Getting closer!

Instead of {{#define LINUX "linux"}} you can just do {{#define LINUX}}.

Can you rename forcedBufferSize -> mergeBufferSize in all places?
(And also the default constant FORCED_...).

I don't think "0" should mean we fallback to BII's default; ie, just
remove that logic and always use mergeBufferSize, and fix the jdoc.

How come you're passing {{true}} as the readOnly arg to
open_direct/_normal in createOutput?  Seems like this will fail?  (Hmm
did you run tests w/ these changes?).


> Genericize DirectIOLinuxDir -> UnixDir
> --------------------------------------
>
>                 Key: LUCENE-2795
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2795
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/store
>            Reporter: Michael McCandless
>            Assignee: Varun Thacker
>              Labels: gsoc2011, lucene-gsoc-11, mentor
>         Attachments: LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch
>
>
> Today DirectIOLinuxDir is tricky/dangerous to use, because you only want to use it for indexWriter and not IndexReader (searching).  It's a trap.
> But, once we do LUCENE-2793, we can make it fully general purpose because then a single native Dir impl can be used.
> I'd also like to make it generic to other Unices, if we can, so that it becomes UnixDirectory.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Updated] (LUCENE-2795) Genericize DirectIOLinuxDir -> UnixDir

Posted by "Varun Thacker (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Varun Thacker updated LUCENE-2795:
----------------------------------

    Attachment: LUCENE-2795.patch

Sorry for the delay. I have made the changes suggested.

> Genericize DirectIOLinuxDir -> UnixDir
> --------------------------------------
>
>                 Key: LUCENE-2795
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2795
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/store
>            Reporter: Michael McCandless
>            Assignee: Varun Thacker
>              Labels: gsoc2011, lucene-gsoc-11, mentor
>         Attachments: LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch
>
>
> Today DirectIOLinuxDir is tricky/dangerous to use, because you only want to use it for indexWriter and not IndexReader (searching).  It's a trap.
> But, once we do LUCENE-2793, we can make it fully general purpose because then a single native Dir impl can be used.
> I'd also like to make it generic to other Unices, if we can, so that it becomes UnixDirectory.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Updated] (LUCENE-2795) Genericize DirectIOLinuxDir -> UnixDir

Posted by "Michael McCandless (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-2795:
---------------------------------------

    Attachment: LUCENE-2795.patch

Hi Varun!

Thanks for checking back here...

I ran this directory on Linux and hit a number of test failures, from
two bugs in the directory impl.  First,
NativeUnixIndexInput.fileLength() failed to include bytes still in the
buffer; second, on clone, if the current II was positioned at EOF
and EOF was on a (512 byte) page boundary then we'd hit
a false IOE.  I fixed those...

I also changed NativeUnixDir to take a delegate Dir which we fall back
to when we don't want to use direct IO; this enabled removing of
open_normal since we just use the "normal" java Dir impls for this.

To run all tests on Linux I do this:
{noformat}
export CLASSPATH=/lucene/unixdir/lucene/build/contrib/misc/lucene-misc-4.0-SNAPSHOT.jar
export LD_LIBRARY_PATH=/lucene/unixdir/lucene/build/native:/usr/local/lib
ant test-core -Dtests.directory=NativeUnixDirectory
{noformat}

(Change the path to your full path... or maybe use relative path but
that could be dangerous since tests may change the CWD... not sure).

It also requires a temporary ctor in the directory taking only File
and using FSDirectory.open for the delegate directory.

Finally I added another setting, minBytesDirect: if the file to open
(in openInput), or expected merge size (in createOutput), is smaller
than this, then we don't use direct IO.

To test this... I temporarily fixed the dir to always use direct IO,
and tests (eventually: ~75 minutes!) passed, except for some Solr tests which look
like timeout problems.  I temporarily dropped the default
buffer size to 4 KB (from 256 KB) else the JVM ran out of direct
buffer space.

I'll test on OS X as well...

                
> Genericize DirectIOLinuxDir -> UnixDir
> --------------------------------------
>
>                 Key: LUCENE-2795
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2795
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/store
>            Reporter: Michael McCandless
>            Assignee: Varun Thacker
>              Labels: gsoc2011, lucene-gsoc-11, mentor
>         Attachments: LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch
>
>
> Today DirectIOLinuxDir is tricky/dangerous to use, because you only want to use it for indexWriter and not IndexReader (searching).  It's a trap.
> But, once we do LUCENE-2793, we can make it fully general purpose because then a single native Dir impl can be used.
> I'd also like to make it generic to other Unices, if we can, so that it becomes UnixDirectory.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Updated] (LUCENE-2795) Genericize DirectIOLinuxDir -> UnixDir

Posted by "Varun Thacker (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Varun Thacker updated LUCENE-2795:
----------------------------------

    Attachment: LUCENE-2795.patch

Updated jdocs.

createOutput switches between direct and normal.

Added constructors NativeUnixIndexOutput/NativeUnixIndexInput which does not take buffersize and uses 256kb by default.

Changed logic of using forcedbuffersize as suggested - It is only used in MERGE context rest of the times BII.buffersize

I am still trying to figure out how to find out different kernel versions in this case. This is what I found out till now. Use the system("uname -r") call in the c program directly. Use a makefile. Still haven't figured out how to do so using cpptasks though. 


> Genericize DirectIOLinuxDir -> UnixDir
> --------------------------------------
>
>                 Key: LUCENE-2795
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2795
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/store
>            Reporter: Michael McCandless
>            Assignee: Varun Thacker
>              Labels: gsoc2011, lucene-gsoc-11, mentor
>         Attachments: LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch
>
>
> Today DirectIOLinuxDir is tricky/dangerous to use, because you only want to use it for indexWriter and not IndexReader (searching).  It's a trap.
> But, once we do LUCENE-2793, we can make it fully general purpose because then a single native Dir impl can be used.
> I'd also like to make it generic to other Unices, if we can, so that it becomes UnixDirectory.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (LUCENE-2795) Genericize DirectIOLinuxDir -> UnixDir

Posted by "Varun Thacker (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13117404#comment-13117404 ] 

Varun Thacker commented on LUCENE-2795:
---------------------------------------

Hi,

I think this patch is almost complete. It will be great if anyone can review it. 
                
> Genericize DirectIOLinuxDir -> UnixDir
> --------------------------------------
>
>                 Key: LUCENE-2795
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2795
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/store
>            Reporter: Michael McCandless
>            Assignee: Varun Thacker
>              Labels: gsoc2011, lucene-gsoc-11, mentor
>         Attachments: LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch, LUCENE-2795.patch
>
>
> Today DirectIOLinuxDir is tricky/dangerous to use, because you only want to use it for indexWriter and not IndexReader (searching).  It's a trap.
> But, once we do LUCENE-2793, we can make it fully general purpose because then a single native Dir impl can be used.
> I'd also like to make it generic to other Unices, if we can, so that it becomes UnixDirectory.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Updated] (LUCENE-2795) Genericize DirectIOLinuxDir -> UnixDir

Posted by "Varun Thacker (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Varun Thacker updated LUCENE-2795:
----------------------------------

    Attachment: LUCENE-2795.patch

the open_direct, posix_fadvise function in onlylinux.h remains the same. 

In onlybsd.h open_direct, posix_fadvise are the same too except that the O_NOATIME flag is not present. 

In onlyosx the open_direct is implemented in a different way. 

Also I have added a open_normal function to all of the headers which will be used in case the IOContext in not a MERGE.

> Genericize DirectIOLinuxDir -> UnixDir
> --------------------------------------
>
>                 Key: LUCENE-2795
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2795
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/store
>            Reporter: Michael McCandless
>            Assignee: Varun Thacker
>              Labels: gsoc2011, lucene-gsoc-11, mentor
>         Attachments: LUCENE-2795.patch, LUCENE-2795.patch
>
>
> Today DirectIOLinuxDir is tricky/dangerous to use, because you only want to use it for indexWriter and not IndexReader (searching).  It's a trap.
> But, once we do LUCENE-2793, we can make it fully general purpose because then a single native Dir impl can be used.
> I'd also like to make it generic to other Unices, if we can, so that it becomes UnixDirectory.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org