Posted to dev@tika.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2021/03/04 19:49:00 UTC

[jira] [Commented] (TIKA-3313) Improve performance and usability of RereadableInputStream

    [ https://issues.apache.org/jira/browse/TIKA-3313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17295552#comment-17295552 ] 

ASF GitHub Bot commented on TIKA-3313:
--------------------------------------

peterkronenberg opened a new pull request #411:
URL: https://github.com/apache/tika/pull/411

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


> Improve performance and usability of RereadableInputStream
> ----------------------------------------------------------
>
>                 Key: TIKA-3313
>                 URL: https://issues.apache.org/jira/browse/TIKA-3313
>             Project: Tika
>          Issue Type: Improvement
>            Reporter: Peter Kronenberg
>            Priority: Major
>
> I was challenged by the following comment:
> {code:java}
> // TODO: At some point it would be better to replace the current approach
> // (specifying the above) with more automated behavior. The stream could
> // keep the original stream open until EOF was reached. For example, if:
> //
> // the original stream is 10 bytes, and
> // only 2 bytes are read on the first pass
> // rewind() is called
> // 5 bytes are read
> //
> // In this case, this instance gets the first 2 from its store,
> // and the next 3 from the original stream, saving those additional 3
> // bytes in the store. In this way, only the maximum number of bytes
> // ever needed must be saved in the store; unused bytes are never read.
> // The original stream is closed when EOF is reached, or when close()
> // is called, whichever comes first. Using this approach eliminates
> // the need to specify the flag (though makes implementation more complex).
> {code}
> Challenge accepted
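
The behavior described in the TODO comment can be sketched roughly as follows. This is only an illustration, not the code from the pull request and not Tika's actual RereadableInputStream: the class name LazyRereadableStream, the storedByteCount() helper, and the purely in-memory store are all assumptions made for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

// Hypothetical sketch of the lazy approach from the TODO comment.
// Assumption: the store lives entirely in memory, which keeps the example short.
class LazyRereadableStream extends InputStream {
    private final InputStream source;     // original stream, read only on demand
    private byte[] store = new byte[16];  // bytes pulled from the source so far
    private int stored = 0;               // number of valid bytes in the store
    private int position = 0;             // current read position
    private boolean sourceDone = false;   // true once the source hit EOF or was closed

    LazyRereadableStream(InputStream source) {
        this.source = source;
    }

    @Override
    public int read() throws IOException {
        // Serve bytes that were already read on an earlier pass from the store.
        if (position < stored) {
            return store[position++] & 0xFF;
        }
        if (sourceDone) {
            return -1;
        }
        // Otherwise pull one more byte from the original stream and remember it.
        int b = source.read();
        if (b == -1) {
            sourceDone = true;
            source.close(); // close the original stream as soon as EOF is reached
            return -1;
        }
        if (stored == store.length) {
            store = Arrays.copyOf(store, stored * 2);
        }
        store[stored++] = (byte) b;
        position++;
        return b;
    }

    // Go back to the start; nothing is re-read from the original stream.
    void rewind() {
        position = 0;
    }

    // Helper for the demo: how many bytes have been saved so far.
    int storedByteCount() {
        return stored;
    }

    @Override
    public void close() throws IOException {
        if (!sourceDone) {
            sourceDone = true;
            source.close();
        }
    }
}

class LazyRereadableDemo {
    public static void main(String[] args) throws IOException {
        byte[] data = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}; // 10-byte original stream
        LazyRereadableStream in = new LazyRereadableStream(new ByteArrayInputStream(data));
        in.read();
        in.read();                 // first pass: only 2 bytes are read
        in.rewind();
        for (int i = 0; i < 5; i++) {
            in.read();             // 2 bytes from the store, 3 from the original stream
        }
        System.out.println(in.storedByteCount()); // prints 5
        in.close();
    }
}
```

With the numbers from the comment, only 5 of the 10 bytes are ever saved, and no flag is needed to tell the stream when to stop buffering. A production version would presumably also need bulk read(byte[], int, int) support and a bounded store, which this sketch leaves out.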



--
This message was sent by Atlassian Jira
(v8.3.4#803005)