Posted to dev@tika.apache.org by "K, Baraneetharan" <ba...@hp.com> on 2012/06/06 12:30:56 UTC
TikaInputStream customization
Can anyone please let me know how to customize TikaInputStream to read only the first 1000 bytes from a given InputStream?
Regards,
Baranee
Re: TikaInputStream customization
Posted by Jukka Zitting <ju...@gmail.com>.
Hi,
On Wed, Jun 6, 2012 at 2:15 PM, Baranee <ba...@hp.com> wrote:
> Can you please tell me how to use the beforeRead() method in TikaInputStream to
> set a read limit for reading bytes from a stream?
http://people.apache.org/~hossman/#xyproblem
Why do you want to use TikaInputStream like this?
BR,
Jukka Zitting
Re: TikaInputStream customization
Posted by Baranee <ba...@hp.com>.
Thanks, Jukka, for your reply.
Can you please tell me how to use the beforeRead() method in TikaInputStream to
set a read limit for reading bytes from a stream?
Baranee
Re: TikaInputStream customization
Posted by Jukka Zitting <ju...@gmail.com>.
Hi,
On Wed, Jun 6, 2012 at 12:30 PM, K, Baraneetharan
<ba...@hp.com> wrote:
> Can anyone please let me know how to customize TikaInputStream to read only
> the first 1000 bytes from a given InputStream?
You can use the BoundedInputStream [1] class from Commons IO:
TikaInputStream.get(new BoundedInputStream(stream, 1000));
However, see the concern in TIKA-307 [2]. Passing a truncated stream
to Tika may produce unexpected results.
[1] http://commons.apache.org/io/api-release/org/apache/commons/io/input/BoundedInputStream.html
[2] https://issues.apache.org/jira/browse/TIKA-307
BR,
Jukka Zitting
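To make the BoundedInputStream suggestion above concrete, here is a minimal sketch of what such a wrapper does, written against the JDK only so it runs without Commons IO or Tika on the classpath. The class name FirstBytesInputStream is hypothetical and is only a stand-in for the Commons IO class; it is not part of either library, and the real BoundedInputStream may differ in details.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

/**
 * Hypothetical stand-in for Commons IO's BoundedInputStream:
 * exposes at most `limit` bytes of the wrapped stream, then reports EOF.
 */
class FirstBytesInputStream extends FilterInputStream {
    private long remaining; // bytes the caller is still allowed to consume

    FirstBytesInputStream(InputStream in, long limit) {
        super(in);
        this.remaining = limit;
    }

    @Override
    public int read() throws IOException {
        if (remaining <= 0) {
            return -1; // limit reached: pretend the stream has ended
        }
        int b = super.read();
        if (b >= 0) {
            remaining--;
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        if (remaining <= 0) {
            return -1;
        }
        // Never ask the underlying stream for more than the remaining budget.
        int n = super.read(buf, off, (int) Math.min(len, remaining));
        if (n > 0) {
            remaining -= n;
        }
        return n;
    }

    @Override
    public long skip(long n) throws IOException {
        // Skipped bytes count against the limit as well.
        long skipped = super.skip(Math.min(n, remaining));
        remaining -= skipped;
        return skipped;
    }

    @Override
    public int available() throws IOException {
        return (int) Math.min(super.available(), remaining);
    }

    // mark/reset are disabled so a reset cannot rewind past the byte budget.
    @Override
    public boolean markSupported() {
        return false;
    }

    @Override
    public synchronized void mark(int readLimit) {
        // intentionally a no-op
    }

    @Override
    public synchronized void reset() throws IOException {
        throw new IOException("mark/reset not supported");
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[5000]; // pretend this is a large document
        try (InputStream in =
                new FirstBytesInputStream(new ByteArrayInputStream(data), 1000)) {
            byte[] buf = new byte[256];
            int total = 0;
            int n;
            while ((n = in.read(buf)) != -1) {
                total += n;
            }
            System.out.println(total); // prints 1000
        }
    }
}
```

With Tika and Commons IO available, the equivalent wrapping is the one-liner quoted in the reply, TikaInputStream.get(new BoundedInputStream(stream, 1000)). The caveat from TIKA-307 still applies to either form: a parser that receives only the first 1000 bytes may see what looks like a corrupt or incomplete document.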