Posted to dev@lucene.apache.org by "Michael McCandless (JIRA)" <ji...@apache.org> on 2011/06/07 16:39:58 UTC

[jira] [Created] (LUCENE-3178) Native MMapDir

Native MMapDir
--------------

                 Key: LUCENE-3178
                 URL: https://issues.apache.org/jira/browse/LUCENE-3178
             Project: Lucene - Java
          Issue Type: Improvement
          Components: core/store
            Reporter: Michael McCandless


Spinoff from LUCENE-2793.

Just like we will create native Dir impl (UnixDirectory) to pass the right OS level IO flags depending on the IOContext, we could in theory do something similar with MMapDir.

The problem is MMap is apparently quite hairy... and to pass the flags the native code would need to invoke mmap (I think?), unlike UnixDir where the code "only" has to open the file handle.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (LUCENE-3178) Native MMapDir

Posted by "Simon Willnauer (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13095349#comment-13095349 ] 

Simon Willnauer commented on LUCENE-3178:
-----------------------------------------

bq. Also slightly off topic, but the javadocs for NMapDir#openInput still show bufferSize as a parameter. In the java file nothing is specified as the @param values. Where is it coming from? It's probably my mistake from LUCENE-2793 but I would like to correct it here.

I don't see the @param; which file are you referring to?





[jira] [Commented] (LUCENE-3178) Native MMapDir

Posted by "Varun Thacker (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13095341#comment-13095341 ] 

Varun Thacker commented on LUCENE-3178:
---------------------------------------

Should we pass the IOContext down to NMapIndexInput, call mmap in the ctor, and then call madvise with the appropriate flag (depending on the context)? Is that the correct way to go about it?
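As a rough illustration of the "appropriate flag depending on the context" idea, here is a minimal sketch. The Context enum is a hypothetical stand-in for whatever IOContext exposes, the Advice names mirror the POSIX_MADV_* advice values, and the mapping policy itself (merges read sequentially, searches read randomly) is an assumption for illustration, not something decided in this issue.

```java
// Sketch only: choose a madvise-style hint from a usage context.
final class AdviceSketch {
    // Hypothetical stand-in for the context carried by IOContext.
    enum Context { MERGE, READ, FLUSH, DEFAULT }

    // Mirrors the POSIX advice values (POSIX_MADV_NORMAL, POSIX_MADV_SEQUENTIAL, ...).
    enum Advice { NORMAL, SEQUENTIAL, RANDOM, WILLNEED, DONTNEED }

    // Assumed policy: a merge scans a file front to back, a search-time
    // read jumps around, and anything else gets the OS default behaviour.
    static Advice adviceFor(Context context) {
        switch (context) {
            case MERGE:
                return Advice.SEQUENTIAL;
            case READ:
                return Advice.RANDOM;
            default:
                return Advice.NORMAL;
        }
    }
}
```

In this shape the native code would still mmap the file with ordinary flags; the hint would be applied afterwards with a madvise call on the mapped region.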

Also slightly off topic, but the javadocs for NMapDir#openInput still show bufferSize as a parameter. In the java file nothing is specified as the @param values. Where is it coming from? It's probably my mistake from LUCENE-2793 but I would like to correct it here.



[jira] [Updated] (LUCENE-3178) Native MMapDir

Posted by "Michael McCandless (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-3178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-3178:
---------------------------------------

    Labels: gsoc2012 lucene-gsoc-12  (was: )
    


[jira] [Commented] (LUCENE-3178) Native MMapDir

Posted by "Varun Thacker (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13094733#comment-13094733 ] 

Varun Thacker commented on LUCENE-3178:
---------------------------------------

Can we use the NativePosixUtil class and call the posix_madvise/madvise methods?
NMapDir#openInput() and NMapDir#createSlicer() both take an IOContext. I'm not quite sure what createSlicer does, though.



[jira] [Commented] (LUCENE-3178) Native MMapDir

Posted by "Robert Muir (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13094740#comment-13094740 ] 

Robert Muir commented on LUCENE-3178:
-------------------------------------

It might even be worth doing the mmap from NativePosixUtil, dealing with all the round-to-pagesize etc that you need, and accessing it with sun.misc.Unsafe.
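For reference, the "round-to-pagesize" bookkeeping could look roughly like the sketch below. mmap needs a file offset that is a multiple of the page size, so the requested offset is rounded down and the mapping is lengthened by the difference. The class and method names are hypothetical, and the page size is passed in as a parameter; real native code would ask the OS for it.

```java
// Sketch only: page-align an (offset, length) request before mapping it.
final class PageAlign {
    /**
     * Returns {alignedOffset, mapLength, delta}:
     * alignedOffset is offset rounded down to a page boundary,
     * mapLength is the number of bytes to map starting there, and
     * delta is where the requested data begins inside the mapping.
     */
    static long[] align(long offset, long length, long pageSize) {
        if (pageSize <= 0 || (pageSize & (pageSize - 1)) != 0) {
            throw new IllegalArgumentException("pageSize must be a power of two");
        }
        long alignedOffset = offset & ~(pageSize - 1); // clear the low bits
        long delta = offset - alignedOffset;           // bytes to skip after mapping
        long mapLength = length + delta;               // cover the skipped prefix too
        return new long[] {alignedOffset, mapLength, delta};
    }
}
```

A caller would map mapLength bytes at alignedOffset and then read starting at (base address + delta).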

I did a little prototype a while back that stole the address from MappedByteBuffer and used Unsafe for all ops with no bounds checks, and the performance improvements were pretty interesting :) But the problem with that approach is you still can't FileChannel.map a file > Integer.MAX_VALUE, meaning we have to handle all the stupidity of multiple mappings, but I think with a native mmap call you could just map the whole thing and avoid this hassle...
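To make the "multiple mappings" hassle concrete, here is a minimal sketch of the pure-Java workaround: since a single FileChannel.map call cannot exceed Integer.MAX_VALUE bytes, the file is mapped in power-of-two chunks and every read splits its position into a chunk index and an offset within that chunk. The class name and the chunkSizePower parameter are assumptions for illustration; this is not the actual MMapDirectory code.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Sketch only: read-only access to a file through several fixed-size mappings.
final class ChunkedMapping {
    private final MappedByteBuffer[] chunks;
    private final int chunkSizePower;
    private final long chunkMask;
    private final long length;

    ChunkedMapping(Path file, int chunkSizePower) throws IOException {
        if (chunkSizePower < 1 || chunkSizePower > 30) {
            throw new IllegalArgumentException("chunkSizePower must be in [1, 30]");
        }
        this.chunkSizePower = chunkSizePower;
        long chunkSize = 1L << chunkSizePower;
        this.chunkMask = chunkSize - 1;
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            this.length = channel.size();
            // Number of chunks, rounding up so the tail of the file is covered.
            int count = (int) ((length + chunkSize - 1) >>> chunkSizePower);
            this.chunks = new MappedByteBuffer[count];
            long pos = 0;
            for (int i = 0; i < count; i++) {
                long size = Math.min(chunkSize, length - pos); // last chunk may be short
                chunks[i] = channel.map(FileChannel.MapMode.READ_ONLY, pos, size);
                pos += size;
            }
        } // the mappings stay valid after the channel is closed
    }

    long length() {
        return length;
    }

    byte readByte(long pos) {
        if (pos < 0 || pos >= length) {
            throw new IndexOutOfBoundsException("pos=" + pos);
        }
        // High bits select the chunk, low bits the offset inside it.
        return chunks[(int) (pos >>> chunkSizePower)].get((int) (pos & chunkMask));
    }
}
```

Multi-byte reads that straddle a chunk boundary need extra handling on top of this, which is part of what a single native mapping of the whole file would avoid.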




[jira] [Commented] (LUCENE-3178) Native MMapDir

Posted by "Robert Muir (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13094754#comment-13094754 ] 

Robert Muir commented on LUCENE-3178:
-------------------------------------

Here are the old benchmarks from that technique, using Unsafe with no bounds checks (only asserts), versus using a MappedByteBuffer:
||Task||QPS base||StdDev base||QPS unsafemmap||StdDev unsafemmap||Pct diff||
|Fuzzy2|21.83|0.58|21.94|1.32|-7% - 9%|
|Respell|25.68|0.13|26.01|0.91|-2% - 5%|
|Fuzzy1|27.70|0.78|28.39|1.48|-5% - 10%|
|TermGroup1M|35.96|1.38|38.92|0.53|2% - 14%|
|PKLookup|41.56|1.05|46.04|1.82|3% - 18%|
|SloppyPhrase|7.06|0.26|7.93|0.43|2% - 22%|
|TermBGroup1M|29.09|1.57|32.70|0.70|4% - 21%|
|TermBGroup1M1P|32.13|1.94|36.86|0.44|6% - 23%|
|SpanNear|6.71|0.12|7.89|0.13|13% - 21%|
|Wildcard|37.62|3.83|44.39|1.41|3% - 35%|
|AndHighHigh|14.53|0.50|17.56|1.12|9% - 33%|
|Phrase|12.20|0.63|14.82|0.35|12% - 31%|
|OrHighHigh|11.77|0.79|14.31|0.26|11% - 32%|
|OrHighMed|11.49|0.75|14.02|0.26|12% - 32%|
|Prefix3|32.70|4.10|40.06|1.77|4% - 46%|
|Term|92.02|6.37|114.13|1.68|14% - 35%|
|AndHighMed|55.38|1.60|69.02|5.48|11% - 38%|
|IntNRQ|7.17|1.19|8.96|0.63|0% - 60%|
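The thread does not say how the "Pct diff" range is derived. One reading that reproduces several rows (Term, Fuzzy2, PKLookup, IntNRQ) is a worst case to best case comparison one standard deviation either side of each mean, truncated to a whole percent. The sketch below encodes that assumption; the class and method names are hypothetical.

```java
// Sketch only: an assumed derivation of the "Pct diff" column.
final class PctDiffRange {
    /** Returns {worstCasePct, bestCasePct}, each truncated toward zero. */
    static int[] pctRange(double qpsBase, double sdBase, double qpsCmp, double sdCmp) {
        double baseHigh = qpsBase + sdBase; // baseline on a good run
        double baseLow = qpsBase - sdBase;  // baseline on a bad run
        // Worst case: candidate on a bad run versus baseline on a good run.
        int worst = (int) (100.0 * ((qpsCmp - sdCmp) - baseHigh) / baseHigh);
        // Best case: candidate on a good run versus baseline on a bad run.
        int best = (int) (100.0 * ((qpsCmp + sdCmp) - baseLow) / baseLow);
        return new int[] {worst, best};
    }
}
```

Read this way, a row whose range starts at or below zero (Fuzzy2, Respell, Fuzzy1, IntNRQ) is within noise, while rows such as SpanNear or Term show a gain even in the pessimistic case.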




[jira] [Commented] (LUCENE-3178) Native MMapDir

Posted by "Varun Thacker (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13097182#comment-13097182 ] 

Varun Thacker commented on LUCENE-3178:
---------------------------------------

bq. Should we pass the IOContext down to NMapIndexInput, call mmap in the ctor, and then call madvise with the appropriate flag (depending on the context)? Is that the correct way to go about it?

Any suggestions on this?



[jira] [Commented] (LUCENE-3178) Native MMapDir

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13045526#comment-13045526 ] 

Michael McCandless commented on LUCENE-3178:
--------------------------------------------

I think we want to call madvise, and not change the flags passed to the original mmap invocation?  But I'm not sure...



[jira] [Commented] (LUCENE-3178) Native MMapDir

Posted by "Robert Muir (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13045461#comment-13045461 ] 

Robert Muir commented on LUCENE-3178:
-------------------------------------

Can the flags you need all be set with madvise(), or are some only available as flags to mmap()?

If madvise() covers them all, it might not be that bad.



[jira] [Commented] (LUCENE-3178) Native MMapDir

Posted by "Varun Thacker (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13095356#comment-13095356 ] 

Varun Thacker commented on LUCENE-3178:
---------------------------------------

Just realized what was wrong. I was using the javadocs from version 3.3, not from trunk. Stupid mistake on my part. Sorry to bring that up.
