Posted to dev@lucy.apache.org by "Marvin Humphrey (JIRA)" <ji...@apache.org> on 2009/10/28 05:15:59 UTC

[jira] Created: (LUCY-63) InStream and OutStream

InStream and OutStream
----------------------

                 Key: LUCY-63
                 URL: https://issues.apache.org/jira/browse/LUCY-63
             Project: Lucy
          Issue Type: Sub-task
            Reporter: Marvin Humphrey
            Assignee: Marvin Humphrey




-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (LUCY-63) InStream and OutStream

Posted by "Marvin Humphrey (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCY-63?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marvin Humphrey updated LUCY-63:
--------------------------------

    Attachment: 055-io_chunks.t
                TestIOChunks.c
                TestIOChunks.bp

> InStream and OutStream
> ----------------------
>
>                 Key: LUCY-63
>                 URL: https://issues.apache.org/jira/browse/LUCY-63
>             Project: Lucy
>          Issue Type: Sub-task
>            Reporter: Marvin Humphrey
>            Assignee: Marvin Humphrey
>         Attachments: 052-instream.t, 054-io_primitives.t, 055-io_chunks.t, 101-simple_io.t, InStream.bp, InStream.c, InStream.pm, MockFileHandle.bp, MockFileHandle.c, OutStream.bp, OutStream.c, OutStream.pm, TestInStream.bp, TestInStream.c, TestIOChunks.bp, TestIOChunks.c, TestIOPrimitives.bp, TestIOPrimitives.c, TestUtils.bp, TestUtils.c
>
>




[jira] Updated: (LUCY-63) InStream and OutStream

Posted by "Marvin Humphrey (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCY-63?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marvin Humphrey updated LUCY-63:
--------------------------------

    Component/s: Core
       Priority: Blocker  (was: Major)

> InStream and OutStream
> ----------------------
>
>                 Key: LUCY-63
>                 URL: https://issues.apache.org/jira/browse/LUCY-63
>             Project: Lucy
>          Issue Type: Sub-task
>          Components: Core
>            Reporter: Marvin Humphrey
>            Assignee: Marvin Humphrey
>            Priority: Blocker
>         Attachments: 052-instream.t, 054-io_primitives.t, 055-io_chunks.t, 101-simple_io.t, InStream.bp, InStream.c, InStream.pm, MockFileHandle.bp, MockFileHandle.c, OutStream.bp, OutStream.c, OutStream.pm, TestInStream.bp, TestInStream.c, TestIOChunks.bp, TestIOChunks.c, TestIOPrimitives.bp, TestIOPrimitives.c, TestUtils.bp, TestUtils.c
>
>




[jira] Updated: (LUCY-63) InStream and OutStream

Posted by "Marvin Humphrey (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCY-63?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marvin Humphrey updated LUCY-63:
--------------------------------

    Attachment: InStream.pm
                InStream.c
                InStream.bp

InStream and OutStream are roughly analogous to Lucene's IndexInput and
IndexOutput classes, but there are some differences.

Under Lucy, alternate "file" treatments are implemented in FileHandle
subclasses: RAMFileHandle, FSFileHandle.  InStream and OutStream are not final,
but only so that they can be extended with new methods.  In contrast, alternate
file treatments are achieved under Lucene by subclassing IndexInput and
IndexOutput directly.

Additionally, InStream and OutStream are always buffered.  This allows us to
inline some functionality that would otherwise have to be implemented in terms
of abstract methods like IndexInput.readByte() and IndexOutput.writeByte().

From Lucene's IndexInput.java (note readByte() in loop):

{code:java}
public int readVInt() throws IOException {
  byte b = readByte();
  int i = b & 0x7F;
  for (int shift = 7; (b & 0x80) != 0; shift += 7) {
    b = readByte();
    i |= (b & 0x7F) << shift;
  }
  return i;
}
{code}

From Lucy's InStream.c (note static inline function SI_read_u8() in loop):

{code:none}
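/* Decode a compressed 32-bit integer: 7 payload bits per byte, most
 * significant group first.  A set high bit means more bytes follow. */
u32_t
InStream_read_c32 (InStream *self)
{
    u32_t retval = 0;
    while (1) {
        const u8_t ubyte = SI_read_u8(self);
        retval = (retval << 7) | (ubyte & 0x7f);
        if ((ubyte & 0x80) == 0) { break; }
    }
    return retval;
}

/* Read one byte straight from the buffer, refilling only when the buffer
 * is exhausted. */
static INLINE u8_t
SI_read_u8(InStream *self)
{
    if (self->buf >= self->limit) { S_refill(self); }
    return (u8_t)*self->buf++;
}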
{code}
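
To make the byte layout concrete, here is a minimal round-trip sketch over a
plain byte array.  The names encode_c32() and decode_c32() are hypothetical and
are not taken from the attached files; the decoder simply mirrors the loop in
InStream_read_c32() with the buffered read replaced by a pointer bump.  Note
that the groups are emitted most significant first, whereas the Lucene
readVInt() loop above consumes the least significant group first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode `value` into `dest` (room for 5 bytes needed)
 * in the order the decoder expects: most significant 7-bit group first,
 * high bit set on every byte except the last.  Returns bytes written. */
size_t
encode_c32(uint32_t value, uint8_t *dest)
{
    uint8_t tmp[5];
    size_t  n       = 0;
    size_t  written = 0;
    assert(dest != NULL);

    /* Collect 7-bit groups, least significant first. */
    do {
        tmp[n++] = (uint8_t)(value & 0x7f);
        value >>= 7;
    } while (value != 0);

    /* Emit them in reverse, flagging all but the final byte. */
    while (n > 1) { dest[written++] = (uint8_t)(tmp[--n] | 0x80); }
    dest[written++] = tmp[0];
    return written;
}

/* Hypothetical helper: decode one value and advance `*cursor` past it.
 * Same loop as InStream_read_c32(), minus the refill check. */
uint32_t
decode_c32(const uint8_t **cursor)
{
    uint32_t retval = 0;
    while (1) {
        const uint8_t ubyte = *(*cursor)++;
        retval = (retval << 7) | (ubyte & 0x7f);
        if ((ubyte & 0x80) == 0) { break; }
    }
    return retval;
}
```

With this layout, 300 encodes as the two bytes 0x82 0x2c.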

The fact that OutStream is buffered means an extra memory copy (Lucene has
this too).  Theoretically, it would be nice if we could write to the system
buffer directly, but that requires extending the file first -- see
[http://www.linuxquestions.org/questions/programming-9/mmap-tutorial-cc-511265/#post2549203].
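
For illustration only, this is roughly what "extending the file first" looks
like on a POSIX system.  It is a sketch under assumptions, not code from
OutStream.c: append_via_mmap() is a hypothetical name, the whole file is
mapped rather than a window, and error handling is minimal.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, POSIX only: append `len` bytes to the file at `path`
 * by growing the file first and then copying into a shared mapping.
 * Returns 0 on success, -1 on failure. */
int
append_via_mmap(const char *path, const void *data, size_t len)
{
    assert(path != NULL && data != NULL);
    if (len == 0) { return 0; }

    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd == -1) { return -1; }

    /* The mapping cannot grow the file, so extend it up front. */
    off_t old_len = lseek(fd, 0, SEEK_END);
    if (old_len == -1 || ftruncate(fd, old_len + (off_t)len) == -1) {
        close(fd);
        return -1;
    }

    /* Map from offset 0, which is always page-aligned. */
    size_t new_len = (size_t)old_len + len;
    void  *map = mmap(NULL, new_len, PROT_READ | PROT_WRITE, MAP_SHARED,
                      fd, 0);
    if (map == MAP_FAILED) {
        close(fd);
        return -1;
    }

    /* The only copy: caller's bytes go directly into the mapped pages. */
    memcpy((char*)map + old_len, data, len);

    int status = munmap(map, new_len);
    if (close(fd) == -1) { status = -1; }
    return status;
}
```

The cost of this approach is an ftruncate() and a remap whenever the output
outgrows the current mapping, which is the bookkeeping a buffered OutStream
avoids.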

The fact that InStream is buffered introduces no extra cost, because there is
no copy: for InStreams which wrap FSFileHandles, the buffer is sourced from a
memory-mapping operation (mmap for Unixen, MapViewOfFile under Windows).
Multiple InStream objects may share the same underlying FileHandle, since they
do not rely on or update the FileHandle's file position or other state
(excluding refcount). 
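
The sharing property can be sketched as follows.  MappedFile and Cursor are
hypothetical stand-ins for FSFileHandle and InStream, not the real structs,
and the sketch maps the whole file with POSIX mmap() instead of refilling a
window on demand.  The point is that each reader keeps its own buf/limit
pointers, so nothing in the shared object changes when a reader advances.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical stand-in for a memory-mapped file handle. */
typedef struct {
    const unsigned char *base;   /* start of the read-only mapping */
    size_t               len;    /* length of the mapping in bytes */
} MappedFile;

/* Hypothetical stand-in for an InStream: a private position over shared
 * bytes. */
typedef struct {
    const unsigned char *buf;    /* next byte to read */
    const unsigned char *limit;  /* one past the last readable byte */
} Cursor;

/* Map the file at `path` read-only.  Returns 0 on success, -1 on failure
 * (including an empty file, which cannot be mapped). */
int
MappedFile_open(MappedFile *mf, const char *path)
{
    struct stat st;
    int fd = open(path, O_RDONLY);
    if (fd == -1) { return -1; }
    if (fstat(fd, &st) == -1 || st.st_size == 0) {
        close(fd);
        return -1;
    }
    void *map = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);  /* the mapping stays valid after the descriptor is closed */
    if (map == MAP_FAILED) { return -1; }
    mf->base = (const unsigned char*)map;
    mf->len  = (size_t)st.st_size;
    return 0;
}

void
MappedFile_close(MappedFile *mf)
{
    munmap((void*)mf->base, mf->len);
    mf->base = NULL;
    mf->len  = 0;
}

/* Create an independent reader starting at `offset`. */
Cursor
Cursor_new(const MappedFile *mf, size_t offset)
{
    assert(offset <= mf->len);
    Cursor c = { mf->base + offset, mf->base + mf->len };
    return c;
}

/* Return the next byte, or -1 once the cursor reaches the end. */
int
Cursor_read_u8(Cursor *c)
{
    if (c->buf >= c->limit) { return -1; }
    return *c->buf++;
}
```

Two cursors created from the same MappedFile can be advanced in any
interleaving without affecting each other.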

At present, no support is provided for systems which do not support memory
mapping.  Previous experiments included a fallback which read data into a
malloc'd buffer, and it would be possible to reintroduce that functionality if
we have to.  For now, though, it's simpler to leave it out.
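
For reference, a fallback of that kind might look something like the sketch
below.  This is not the earlier experimental code, which is not reproduced
here; read_window() is a hypothetical name, and the sketch uses plain stdio
so that it does not depend on any memory-mapping support.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical fallback: copy `len` bytes starting at `offset` from the
 * file at `path` into a freshly malloc'd buffer.  The caller owns the
 * buffer and must free() it.  Returns NULL on any failure, including a
 * short read. */
unsigned char*
read_window(const char *path, long offset, size_t len)
{
    assert(path != NULL && offset >= 0);

    FILE *fh = fopen(path, "rb");
    if (fh == NULL) { return NULL; }

    /* Avoid malloc(0), whose return value is implementation-defined. */
    unsigned char *buf = malloc(len > 0 ? len : 1);
    if (buf != NULL
        && (fseek(fh, offset, SEEK_SET) != 0
            || fread(buf, 1, len, fh) != len)
    ) {
        free(buf);
        buf = NULL;
    }

    fclose(fh);
    return buf;
}
```

Unlike the mapped case, every window costs an allocation and a copy, which
is the extra expense the mmap-backed InStream avoids.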



> InStream and OutStream
> ----------------------
>
>                 Key: LUCY-63
>                 URL: https://issues.apache.org/jira/browse/LUCY-63
>             Project: Lucy
>          Issue Type: Sub-task
>            Reporter: Marvin Humphrey
>            Assignee: Marvin Humphrey
>         Attachments: InStream.bp, InStream.c, InStream.pm
>
>




[jira] Updated: (LUCY-63) InStream and OutStream

Posted by "Marvin Humphrey (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCY-63?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marvin Humphrey updated LUCY-63:
--------------------------------

    Attachment: TestUtils.c
                TestUtils.bp

> InStream and OutStream
> ----------------------
>
>                 Key: LUCY-63
>                 URL: https://issues.apache.org/jira/browse/LUCY-63
>             Project: Lucy
>          Issue Type: Sub-task
>            Reporter: Marvin Humphrey
>            Assignee: Marvin Humphrey
>         Attachments: 052-instream.t, InStream.bp, InStream.c, InStream.pm, MockFileHandle.bp, MockFileHandle.c, OutStream.bp, OutStream.c, OutStream.pm, TestInStream.bp, TestInStream.c, TestUtils.bp, TestUtils.c
>
>




[jira] Updated: (LUCY-63) InStream and OutStream

Posted by "Marvin Humphrey (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCY-63?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marvin Humphrey updated LUCY-63:
--------------------------------

    Attachment: OutStream.c
                OutStream.bp
                OutStream.pm

> InStream and OutStream
> ----------------------
>
>                 Key: LUCY-63
>                 URL: https://issues.apache.org/jira/browse/LUCY-63
>             Project: Lucy
>          Issue Type: Sub-task
>            Reporter: Marvin Humphrey
>            Assignee: Marvin Humphrey
>         Attachments: InStream.bp, InStream.c, InStream.pm, MockFileHandle.bp, MockFileHandle.c, OutStream.bp, OutStream.c, OutStream.pm
>
>




[jira] Resolved: (LUCY-63) InStream and OutStream

Posted by "Marvin Humphrey (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCY-63?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marvin Humphrey resolved LUCY-63.
---------------------------------

    Resolution: Fixed

Committed as r830909.

> InStream and OutStream
> ----------------------
>
>                 Key: LUCY-63
>                 URL: https://issues.apache.org/jira/browse/LUCY-63
>             Project: Lucy
>          Issue Type: Sub-task
>          Components: Core
>            Reporter: Marvin Humphrey
>            Assignee: Marvin Humphrey
>            Priority: Blocker
>         Attachments: 052-instream.t, 054-io_primitives.t, 055-io_chunks.t, 101-simple_io.t, InStream.bp, InStream.c, InStream.pm, MockFileHandle.bp, MockFileHandle.c, OutStream.bp, OutStream.c, OutStream.pm, TestInStream.bp, TestInStream.c, TestIOChunks.bp, TestIOChunks.c, TestIOPrimitives.bp, TestIOPrimitives.c, TestUtils.bp, TestUtils.c
>
>




[jira] Updated: (LUCY-63) InStream and OutStream

Posted by "Marvin Humphrey (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCY-63?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marvin Humphrey updated LUCY-63:
--------------------------------

    Attachment: TestIOPrimitives.c
                TestIOPrimitives.bp
                054-io_primitives.t

> InStream and OutStream
> ----------------------
>
>                 Key: LUCY-63
>                 URL: https://issues.apache.org/jira/browse/LUCY-63
>             Project: Lucy
>          Issue Type: Sub-task
>            Reporter: Marvin Humphrey
>            Assignee: Marvin Humphrey
>         Attachments: 052-instream.t, 054-io_primitives.t, 055-io_chunks.t, 101-simple_io.t, InStream.bp, InStream.c, InStream.pm, MockFileHandle.bp, MockFileHandle.c, OutStream.bp, OutStream.c, OutStream.pm, TestInStream.bp, TestInStream.c, TestIOChunks.bp, TestIOChunks.c, TestIOPrimitives.bp, TestIOPrimitives.c, TestUtils.bp, TestUtils.c
>
>




[jira] Updated: (LUCY-63) InStream and OutStream

Posted by "Marvin Humphrey (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCY-63?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marvin Humphrey updated LUCY-63:
--------------------------------

    Attachment: 101-simple_io.t

> InStream and OutStream
> ----------------------
>
>                 Key: LUCY-63
>                 URL: https://issues.apache.org/jira/browse/LUCY-63
>             Project: Lucy
>          Issue Type: Sub-task
>            Reporter: Marvin Humphrey
>            Assignee: Marvin Humphrey
>         Attachments: 052-instream.t, 054-io_primitives.t, 055-io_chunks.t, 101-simple_io.t, InStream.bp, InStream.c, InStream.pm, MockFileHandle.bp, MockFileHandle.c, OutStream.bp, OutStream.c, OutStream.pm, TestInStream.bp, TestInStream.c, TestIOChunks.bp, TestIOChunks.c, TestIOPrimitives.bp, TestIOPrimitives.c, TestUtils.bp, TestUtils.c
>
>




[jira] Updated: (LUCY-63) InStream and OutStream

Posted by "Marvin Humphrey (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCY-63?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marvin Humphrey updated LUCY-63:
--------------------------------

    Attachment: 052-instream.t
                TestInStream.c
                TestInStream.bp

> InStream and OutStream
> ----------------------
>
>                 Key: LUCY-63
>                 URL: https://issues.apache.org/jira/browse/LUCY-63
>             Project: Lucy
>          Issue Type: Sub-task
>            Reporter: Marvin Humphrey
>            Assignee: Marvin Humphrey
>         Attachments: 052-instream.t, InStream.bp, InStream.c, InStream.pm, MockFileHandle.bp, MockFileHandle.c, OutStream.bp, OutStream.c, OutStream.pm, TestInStream.bp, TestInStream.c, TestUtils.bp, TestUtils.c
>
>




[jira] Updated: (LUCY-63) InStream and OutStream

Posted by "Marvin Humphrey (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCY-63?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marvin Humphrey updated LUCY-63:
--------------------------------

    Attachment: MockFileHandle.c
                MockFileHandle.bp

> InStream and OutStream
> ----------------------
>
>                 Key: LUCY-63
>                 URL: https://issues.apache.org/jira/browse/LUCY-63
>             Project: Lucy
>          Issue Type: Sub-task
>            Reporter: Marvin Humphrey
>            Assignee: Marvin Humphrey
>         Attachments: InStream.bp, InStream.c, InStream.pm, MockFileHandle.bp, MockFileHandle.c, OutStream.bp, OutStream.c, OutStream.pm
>
>

