Posted to dev@lucene.apache.org by "Robert Muir (JIRA)" <ji...@apache.org> on 2010/12/01 15:19:13 UTC

[jira] Created: (LUCENE-2787) disable atime for DirectIOLinuxDirectory

disable atime for DirectIOLinuxDirectory
----------------------------------------

                 Key: LUCENE-2787
                 URL: https://issues.apache.org/jira/browse/LUCENE-2787
             Project: Lucene - Java
          Issue Type: Improvement
          Components: contrib/*
            Reporter: Robert Muir
             Fix For: 3.1, 4.0


In Linux's open():
O_NOATIME
    (Since Linux 2.6.8) Do not update the file last access time (st_atime in the inode) when the file is read(2). This flag is intended for use by indexing or backup programs, where its use can significantly reduce the amount of disk activity. This flag may not be effective on all filesystems. One example is NFS, where the server maintains the access time.

So we should do this in our linux-specific DirectIOLinuxDirectory.
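The patch itself is not reproduced in this thread, so the following is only a rough Python sketch of what passing O_NOATIME to open(2) looks like, not the code that went into DirectIOLinuxDirectory. The helper names open_noatime and read_all are made up for illustration. The fallback is there because open(2) fails with EPERM when the caller neither owns the file nor has CAP_FOWNER.

```python
import os

# O_NOATIME is Linux-only; use 0 (no extra flag) on platforms without it.
O_NOATIME = getattr(os, "O_NOATIME", 0)


def open_noatime(path):
    """Open path read-only, asking the kernel not to update st_atime.

    Returns (fd, noatime_applied). If the kernel refuses the flag with a
    permission error, the file is opened again without it.
    """
    if O_NOATIME:
        try:
            return os.open(path, os.O_RDONLY | O_NOATIME), True
        except PermissionError:
            pass  # not the owner of the file: fall back to a plain open
    return os.open(path, os.O_RDONLY), False


def read_all(path):
    """Read the whole file through a descriptor opened by open_noatime."""
    fd, _ = open_noatime(path)
    try:
        chunks = []
        while True:
            chunk = os.read(fd, 65536)
            if not chunk:  # empty read means end of file
                break
            chunks.append(chunk)
        return b"".join(chunks)
    finally:
        os.close(fd)
```

The flag applies to one file descriptor only, so other programs reading the same files still update atime as usual.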

Separately (off-topic), it would be better if this were a LinuxDirectory that only uses O_DIRECT when it should :)
It would be nice to think about an optional modules/native for common platforms, similar to what Tomcat provides.
It's easier to test directories like this now (-Dtests.directory)...


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Commented: (LUCENE-2787) disable atime for DirectIOLinuxDirectory

Posted by "Simon Willnauer (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12965695#action_12965695 ] 

Simon Willnauer commented on LUCENE-2787:
-----------------------------------------

Robert, you can also control this through mount options, i.e. by setting the noatime option on the mount command. Do you think it is absolutely necessary to set this in here by default?

simon



[jira] Commented: (LUCENE-2787) disable atime for DirectIOLinuxDirectory

Posted by "Robert Muir (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12965705#action_12965705 ] 

Robert Muir commented on LUCENE-2787:
-------------------------------------

Uwe: I don't interpret it that way!

I don't think our IndexInputs should be doing writes!



[jira] Commented: (LUCENE-2787) disable atime for DirectIOLinuxDirectory

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12965740#action_12965740 ] 

Michael McCandless commented on LUCENE-2787:
--------------------------------------------

+1, this is a no-brainer.  I had no idea Linux lets you turn off atime per file descriptor!

It's ridiculous that the OS maintains an atime on our index files.

Uwe, I agree about the intention of the man page (so e.g. back when contrib/benchmark used to write 10,000 files to run its tests, and then index them, we could've seen a big speedup :) ), but it still can't hurt to also turn it off when opening the index files for reading.

I think atime is updated per-read not just at open (http://lkml.org/lkml/1998/12/14/81) though I'm not sure.  Even so, it's presumably cached in the OS's write buffer and then only flushed periodically, so I don't think we'll see sizable gains here.  But every bit counts so I think we should do it.
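One way to check this on a given machine is to compare st_atime before and after a read, with and without the flag. The sketch below is only an illustration: atime_changed_by_read is a made-up helper, and the answer it gives depends on the filesystem and its mount options (noatime, relatime, ...), so no particular output is claimed.

```python
import os
import time

# O_NOATIME is Linux-only; use 0 (no extra flag) on platforms without it.
O_NOATIME = getattr(os, "O_NOATIME", 0)


def atime_changed_by_read(path, extra_flags=0):
    """Return True if opening and reading path changed its st_atime."""
    before = os.stat(path).st_atime_ns
    time.sleep(0.05)  # filesystem timestamps are coarse; let the clock tick
    fd = os.open(path, os.O_RDONLY | extra_flags)
    try:
        os.read(fd, 4096)
    finally:
        os.close(fd)
    return os.stat(path).st_atime_ns != before


# Usage (the path is hypothetical; the caller must own the file for O_NOATIME):
#   atime_changed_by_read("/path/to/index/file")             # plain read
#   atime_changed_by_read("/path/to/index/file", O_NOATIME)  # read with the flag
```

This only shows whether the in-memory inode timestamp moved; it says nothing about when the kernel writes that change back to disk.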



[jira] Commented: (LUCENE-2787) disable atime for DirectIOLinuxDirectory

Posted by "Robert Muir (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12965697#action_12965697 ] 

Robert Muir commented on LUCENE-2787:
-------------------------------------

Simon, of course you can, but why not set it? Our indexes don't need the atime for any reason.

The option exists specifically for apps like lucene... see the description from the man page!!!!



[jira] Updated: (LUCENE-2787) disable atime for DirectIOLinuxDirectory

Posted by "Robert Muir (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir updated LUCENE-2787:
--------------------------------

    Attachment: LUCENE-2787.patch

All core tests pass with this directory.



[jira] Commented: (LUCENE-2787) disable atime for DirectIOLinuxDirectory

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12965701#action_12965701 ] 

Uwe Schindler commented on LUCENE-2787:
---------------------------------------

> The option exists specifically for apps like lucene... see the description from the man page!!!!

The intention behind the man page is not for the part of the app that manages the *index* itself (like Lucene). It is for the part of the app that *reads* files *to index them* (so that would be the app that uses Lucene and e.g. uses Tika to read all files; this one should set noatime). The idea is to not mark the file as "accessed" when the virus scanner or the KDE/GNOME file system browser indexes it.

Simon is right about setting it as a mount option.



[jira] Commented: (LUCENE-2787) disable atime for DirectIOLinuxDirectory

Posted by "Simon Willnauer (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12965752#action_12965752 ] 

Simon Willnauer commented on LUCENE-2787:
-----------------------------------------

After all, I think we should really do it. I cannot think of any situation where you want atime to be updated here. It seems that lots of distributions use relatime, which is smarter about it; see: http://lwn.net/Articles/244829/

We should really document that on the wiki so that folks can check what their distribution does, or set it to noatime by default.

simon
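For the "check what their dist does" part, a minimal sketch of looking up the atime-related mount options for the filesystem that holds an index directory. It assumes the usual /proc/mounts layout (device, mount point, type, comma-separated options); atime_options_for and the sample paths are made up for illustration, and mount points containing spaces (escaped as \040 in /proc/mounts) are not handled.

```python
import os

ATIME_OPTIONS = ("noatime", "relatime", "strictatime", "nodiratime")


def atime_options_for(path, mounts_text):
    """Return the atime-related mount options of the filesystem holding path.

    mounts_text is the content of /proc/mounts. The longest mount point that
    is a parent of path wins; on ties, the later line wins.
    """
    path = os.path.abspath(path)
    best_mount, best_opts = "", []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip malformed lines
        mount_point, options = fields[1], fields[3].split(",")
        is_parent = (
            mount_point == "/"
            or path == mount_point
            or path.startswith(mount_point + "/")
        )
        if is_parent and len(mount_point) >= len(best_mount):
            best_mount, best_opts = mount_point, options
    return [opt for opt in best_opts if opt in ATIME_OPTIONS]


# Hypothetical mount table, for illustration only.
SAMPLE_MOUNTS = """\
/dev/sda1 / ext4 rw,relatime 0 0
/dev/sdb1 /var/lucene ext4 rw,noatime,nodiratime 0 0
"""

# On a real system (Linux only):
#   with open("/proc/mounts") as f:
#       print(atime_options_for("/path/to/index", f.read()))
```

An empty result means none of the listed options appears in the mount table for that filesystem; what the kernel does in that case depends on its default.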



[jira] Resolved: (LUCENE-2787) disable atime for DirectIOLinuxDirectory

Posted by "Robert Muir (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir resolved LUCENE-2787.
---------------------------------

    Resolution: Fixed
      Assignee: Robert Muir

Committed revision 1041954 (trunk), 1041957 (3x)



[jira] Commented: (LUCENE-2787) disable atime for DirectIOLinuxDirectory

Posted by "Robert Muir (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12965700#action_12965700 ] 

Robert Muir commented on LUCENE-2787:
-------------------------------------

Also Simon, I just wanted to say that you need to be root to change the mount options, etc.

I think this is totally appropriate for us to do, again quoting from the man page:

"This flag is intended for use by *indexing* or backup programs, where its use can significantly reduce the amount of disk activity."


