Posted to dev@lucene.apache.org by cu...@apache.org on 2004/09/28 23:40:11 UTC

cvs commit: jakarta-lucene/src/java/org/apache/lucene/store MMapDirectory.java

cutting     2004/09/28 14:40:11

  Added:       src/java/org/apache/lucene/store MMapDirectory.java
  Log:
  Add an nio mmap based Directory implementation.
  
  Revision  Changes    Path
  1.1                  jakarta-lucene/src/java/org/apache/lucene/store/MMapDirectory.java
  
  Index: MMapDirectory.java
  ===================================================================
  package org.apache.lucene.store;
  
  /**
   * Copyright 2004 The Apache Software Foundation
   *
   * Licensed under the Apache License, Version 2.0 (the "License");
   * you may not use this file except in compliance with the License.
   * You may obtain a copy of the License at
   *
   *     http://www.apache.org/licenses/LICENSE-2.0
   *
   * Unless required by applicable law or agreed to in writing, software
   * distributed under the License is distributed on an "AS IS" BASIS,
   * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   * See the License for the specific language governing permissions and
   * limitations under the License.
   */
  
  import java.io.IOException;
  import java.io.File;
  import java.io.RandomAccessFile;
  import java.nio.ByteBuffer;
  import java.nio.channels.FileChannel;
  import java.nio.channels.FileChannel.MapMode;
  
  /** File-based {@link Directory} implementation that uses mmap for input.
   *
   * <p>To use this, invoke Java with the System property
   * org.apache.lucene.FSDirectory.class set to
   * org.apache.lucene.store.MMapDirectory.  This will cause {@link
   * FSDirectory#getDirectory(File,boolean)} to return instances of this class.
   *
   * @author Doug Cutting
   */
  public class MMapDirectory extends FSDirectory {
  
    private class MMapIndexInput extends IndexInput {
  
      private ByteBuffer buffer;
      private RandomAccessFile file;
      private long length;
      private boolean isClone;
  
      public MMapIndexInput(String path) throws IOException {
        this.file = new RandomAccessFile(path, "r");
        this.length = file.length();
        this.buffer = file.getChannel().map(MapMode.READ_ONLY, 0, length);
      }
  
      public byte readByte() throws IOException {
        return buffer.get();
      }
  
      public void readBytes(byte[] b, int offset, int len)
        throws IOException {
        buffer.get(b, offset, len);
      }
  
      public long getFilePointer() {
        return buffer.position();
      }
  
      public void seek(long pos) throws IOException {
        buffer.position((int)pos);
      }
  
      public long length() {
        return length;
      }
  
      public Object clone() {
        MMapIndexInput clone = (MMapIndexInput)super.clone();
        clone.isClone = true;
        clone.buffer = buffer.duplicate();
        return clone;
      }
  
      public void close() throws IOException {
        if (!isClone)
          file.close();
      }
    }
  
    private MMapDirectory() {}                      // no public ctor
  
    public IndexInput openInput(String name) throws IOException {
      return new MMapIndexInput(new File(getFile(), name).getPath());
    }
  }
  
  
  
  

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-dev-help@jakarta.apache.org


Re: Using MMapDirectory fails TestCompoundFile; MMapDirectory for huge indexes

Posted by Paul Elschot <pa...@xs4all.nl>.
On Friday 01 October 2004 22:16, Doug Cutting wrote:
> Paul Elschot wrote:
> > I'm working on a memory mapped directory that uses multiple buffers
> > for large files.
>
> Great!
>
> There will be a small performance hit, as each call to readByte() will
> need to first check whether it's overflowed the current buffer, right?

Yes. I just simplified that test to checking whether a counter equals zero.
readByte() also decrements that counter.
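
A minimal sketch of the counter-based check described above. This is not
the multibuffer MMapDirectory itself: the class and field names are
hypothetical, and heap buffers wrapped around a byte array stand in for
mapped buffers so that the example runs without an index file.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch: reads sequentially across several fixed-size buffers.
class MultiBufferSketch {

    static final class MultiBufferInput {
        private final ByteBuffer[] buffers;
        private int current = 0;   // index of the buffer being read
        private int avail;         // bytes left in the current buffer

        // Assumption: heap buffers replace the mapped buffers of the real thing.
        MultiBufferInput(byte[] data, int maxBufSize) {
            int n = (data.length + maxBufSize - 1) / maxBufSize;
            buffers = new ByteBuffer[Math.max(n, 1)];
            for (int i = 0; i < buffers.length; i++) {
                int start = i * maxBufSize;
                int len = Math.min(maxBufSize, data.length - start);
                buffers[i] = ByteBuffer.wrap(data, start, len).slice();
            }
            avail = buffers[0].remaining();
        }

        // Reading past the end throws ArrayIndexOutOfBoundsException.
        byte readByte() {
            if (avail == 0) {      // the single per-call overflow test
                current++;
                avail = buffers[current].remaining();
            }
            avail--;               // the counter is decremented on every read
            return buffers[current].get();
        }
    }

    public static void main(String[] args) {
        byte[] data = new byte[10];
        for (int i = 0; i < data.length; i++) data[i] = (byte) i;
        MultiBufferInput in = new MultiBufferInput(data, 4);
        for (int i = 0; i < data.length; i++) {
            System.out.println(in.readByte());
        }
    }
}
```

The per-call cost is the comparison and the decrement; switching buffers
happens only once per buffer boundary.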

> > While trying some test runs I found that the current version fails a
> > test:
> >
> >     [junit] Testsuite: org.apache.lucene.index.TestCompoundFile
>
> Thanks for testing this!

Errr, TestCompoundFile turns out to be quite extensive, so I'll share that 
with Dmitry.

>
> > I'm testing the version with multiple buffers using a smaller maximum
> > buffer size (1024 * 128), and it does this test in the same way.
>
> You mean it fails too?

Yes, it behaves in the same way. It also passes the remaining tests
from TestCompoundFile for a smaller maximum buffer size.

> > I have not yet looked into TestCompoundFile. When it is a good test
> > case for this, I'll submit the multibuffer version as an enhancement.
>
> Thanks, that would be great.

A new version of MMapDirectory is on its way with
a few comments on possible performance improvements.

Regards,
Paul Elschot





Re: Using MMapDirectory fails TestCompoundFile; MMapDirectory for huge indexes

Posted by Doug Cutting <cu...@apache.org>.
Paul Elschot wrote:
> I'm working on a memory mapped directory that uses multiple buffers
> for large files.

Great!

There will be a small performance hit, as each call to readByte() will 
need to first check whether it's overflowed the current buffer, right?

> While trying some test runs I found that the current version fails a test:
> 
>     [junit] Testsuite: org.apache.lucene.index.TestCompoundFile

Thanks for testing this!

> I'm testing the version with multiple buffers using a smaller maximum
> buffer size (1024 * 128), and it does this test in the same way.

You mean it fails too?

> I have not yet looked into TestCompoundFile. When it is a good test
> case for this, I'll submit the multibuffer version as an enhancement.

Thanks, that would be great.

Doug



Using MMapDirectory fails TestCompoundFile; MMapDirectory for huge indexes

Posted by Paul Elschot <pa...@xs4all.nl>.
On Wednesday 29 September 2004 00:00, Doug Cutting wrote:
> cutting@apache.org wrote:
> >   Added:       src/java/org/apache/lucene/store MMapDirectory.java
> >   Log:
> >   Add an nio mmap based Directory implementation.
>
> For my simple benchmarks this is somewhat slower than the classic
> FSDirectory, but I thought it was still worth having.  It should use
> less memory when there are lots of query terms, since it does not need
> to allocate a new buffer per term and the mmapped data can be shared.
> This may be good for folks who, e.g., use lots of wildcards.  It also
> should, in theory, someday be faster.  One downside is that it cannot
> handle indexes with files larger than 2^31 bytes.
>

I'm working on a memory mapped directory that uses multiple buffers
for large files.

While trying some test runs I found that the current version fails a test:

    [junit] Testsuite: org.apache.lucene.index.TestCompoundFile
    [junit] Tests run: 9, Failures: 1, Errors: 0, Time elapsed: 4.238 sec

    [junit] Testcase: testClonedStreamsClosing(org.apache.lucene.index.TestCompoundFile):	FAILED
    [junit] null
    [junit] junit.framework.AssertionFailedError
    [junit] 	at org.apache.lucene.index.TestCompoundFile.testClonedStreamsClosing(TestCompoundFile.java:368)
    [junit] 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    [junit] 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    [junit] 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)

As indicated in the source, I added a System property to the JVM to use
MMapDirectory. This is line 261 in build.xml for the junit tests:
      <jvmarg value="-Dorg.apache.lucene.FSDirectory.class=org.apache.lucene.store.MMapDirectory"/>
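
A rough sketch of how a System property can select an implementation
class at run time. It is not the FSDirectory code: the Store interface,
both implementations, and the property name sketch.Store.class are
hypothetical and only illustrate the general mechanism.

```java
// Hypothetical sketch: pick an implementation class from a System property.
class PropertySelectSketch {

    interface Store { String kind(); }

    static class DefaultStore implements Store {
        public String kind() { return "default"; }
    }

    static class MappedStore implements Store {
        public String kind() { return "mapped"; }
    }

    // Assumption: an invented property name, not the Lucene one.
    static final String PROP = "sketch.Store.class";

    static Store create() {
        // Fall back to the default implementation when the property is unset.
        String name = System.getProperty(PROP, DefaultStore.class.getName());
        try {
            return (Store) Class.forName(name)
                .getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot load " + name, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(create().kind());
    }
}
```

Starting the JVM with -Dsketch.Store.class=PropertySelectSketch$MappedStore
(quoted as needed for the shell) would make create() return the mapped
variant instead of the default one.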

I'm testing the version with multiple buffers using a smaller maximum
buffer size (1024 * 128), and it does this test in the same way.
I have not yet looked into TestCompoundFile. When it is a good test
case for this, I'll submit the multibuffer version as an enhancement.

Regards,
Paul Elschot.





Re: cvs commit: jakarta-lucene/src/java/org/apache/lucene/store MMapDirectory.java

Posted by Doug Cutting <cu...@apache.org>.
cutting@apache.org wrote:
>   Added:       src/java/org/apache/lucene/store MMapDirectory.java
>   Log:
>   Add an nio mmap based Directory implementation.

For my simple benchmarks this is somewhat slower than the classic 
FSDirectory, but I thought it was still worth having.  It should use 
less memory when there are lots of query terms, since it does not need 
to allocate a new buffer per term and the mmapped data can be shared. 
This may be good for folks who, e.g., use lots of wildcards.  It also 
should, in theory, someday be faster.  One downside is that it cannot 
handle indexes with files larger than 2^31 bytes.
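
A small sketch of the sharing that clone() in the commit relies on via
buffer.duplicate(): a duplicate sees the same bytes but keeps its own
position. As an assumption for runnability, a heap buffer stands in for
the mapped one, and the class and method names are hypothetical.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch: duplicates share content but not position.
class DuplicateSketch {

    // Returns { position of the original, position of the duplicate }.
    static int[] positionsAfterReads() {
        ByteBuffer original = ByteBuffer.wrap(new byte[] {10, 20, 30, 40});
        original.get();                          // original is now at position 1
        ByteBuffer copy = original.duplicate();  // same bytes, starts at position 1
        copy.get();
        copy.get();                              // copy is now at position 3
        return new int[] { original.position(), copy.position() };
    }

    public static void main(String[] args) {
        int[] p = positionsAfterReads();
        System.out.println("original=" + p[0] + " copy=" + p[1]);
    }
}
```

No bytes are copied when the duplicate is made, which is why each clone
does not need a buffer of its own.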

Doug
