Posted to dev@tika.apache.org by "Tim Allison (Jira)" <ji...@apache.org> on 2021/03/15 21:23:00 UTC

[jira] [Resolved] (TIKA-3313) Improve performance and usability of RereadableInputStream

     [ https://issues.apache.org/jira/browse/TIKA-3313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Allison resolved TIKA-3313.
-------------------------------
    Fix Version/s: 2.0.0
       Resolution: Fixed

Thank you, [~peterkronenberg]!

> Improve performance and usability of RereadableInputStream
> ----------------------------------------------------------
>
>                 Key: TIKA-3313
>                 URL: https://issues.apache.org/jira/browse/TIKA-3313
>             Project: Tika
>          Issue Type: Improvement
>            Reporter: Peter Kronenberg
>            Priority: Major
>             Fix For: 2.0.0
>
>
> I was challenged by the following comment in RereadableInputStream:
> {code:java}
> // TODO: At some point it would be better to replace the current approach
> // (specifying the above) with more automated behavior. The stream could
> // keep the original stream open until EOF was reached. For example, if:
> //
> // the original stream is 10 bytes, and
> // only 2 bytes are read on the first pass
> // rewind() is called
> // 5 bytes are read
> //
> // In this case, this instance gets the first 2 from its store,
> // and the next 3 from the original stream, saving those additional 3
> // bytes in the store. In this way, only the maximum number of bytes
> // ever needed must be saved in the store; unused bytes are never read.
> // The original stream is closed when EOF is reached, or when close()
> // is called, whichever comes first. Using this approach eliminates
> // the need to specify the flag (though makes implementation more complex).
> {code}
> Challenge accepted
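
For readers following along, the behavior described in the TODO comment can be sketched roughly as below. This is only an illustration and not the code committed for TIKA-3313: the class name LazyRereadableStream and the storedBytes() helper are hypothetical, and the store is assumed to be a simple in-memory array, whereas a real implementation would likely spill to a file for large inputs.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

// Hypothetical sketch: reads lazily from the original stream and keeps
// only the bytes that have actually been requested so far.
class LazyRereadableStream extends InputStream {
    private InputStream original;        // set to null once closed
    private byte[] store = new byte[16]; // assumption: in-memory store only
    private int stored = 0;              // number of bytes saved in the store
    private int position = 0;            // current read position

    LazyRereadableStream(InputStream original) {
        this.original = original;
    }

    @Override
    public int read() throws IOException {
        if (position < stored) {
            // Already seen on an earlier pass: serve it from the store.
            return store[position++] & 0xFF;
        }
        if (original == null) {
            return -1; // original exhausted or closed
        }
        int b = original.read();
        if (b == -1) {
            closeOriginal(); // EOF reached: the original is no longer needed
            return -1;
        }
        if (stored == store.length) {
            store = Arrays.copyOf(store, stored * 2);
        }
        store[stored++] = (byte) b;
        position++;
        return b;
    }

    // Restart reading from the first byte; stored bytes are replayed.
    void rewind() {
        position = 0;
    }

    // Hypothetical helper exposing how many bytes have been saved.
    int storedBytes() {
        return stored;
    }

    @Override
    public void close() throws IOException {
        closeOriginal();
    }

    private void closeOriginal() throws IOException {
        if (original != null) {
            original.close();
            original = null;
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}; // 10-byte original
        LazyRereadableStream in =
                new LazyRereadableStream(new ByteArrayInputStream(data));
        byte[] buf = new byte[5];
        in.read(buf, 0, 2); // first pass: 2 bytes
        in.rewind();
        in.read(buf, 0, 5); // 2 from the store, 3 from the original
        System.out.println(in.storedBytes()); // prints 5
    }
}
```

The main method walks through the example from the comment: after reading 2 bytes, rewinding, and reading 5, only 5 of the 10 bytes have been pulled from the original stream. The bulk read falls back on InputStream's default byte-at-a-time loop, which keeps the sketch short at the cost of performance.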



--
This message was sent by Atlassian Jira
(v8.3.4#803005)