Posted to dev@tika.apache.org by Runomu <ce...@gmail.com> on 2014/11/12 19:22:31 UTC
Tika Api consumes given stream
I use the Apache Tika bundle dependency in a project to determine the MIME
types of files. Due to some issues, we have to detect the type from an
InputStream. Detection is supposed to mark and reset the given InputStream,
leaving it unconsumed. Tika-Bundle includes the core and parser APIs and uses
POIFSContainerDetector, ZipContainerDetector, OggDetector, MimeTypes and Magic
for detection. I have been debugging for 3 hours, and all of the detectors
mark and reset the stream after detection. I did it the following way:
TikaInputStream tis = null;
try {
    TikaConfig config = new TikaConfig();
    tikaDetector = config.getDetector();
    tis = TikaInputStream.get(in);
    MediaType mediaType = tikaDetector.detect(tis, new Metadata());
    if (mediaType != null) {
        String[] types = mediaType.toString().split(",");
        for (int i = 0; i < types.length; i++) {
            mimeTypes.add(new MimeType(types[i]));
        }
    }
} catch (Exception e) {
    logger.error("Mime Type for given Stream could not be resolved: ", e);
}
But the stream is consumed. Does anyone know how to determine the MIME type
without consuming the stream?
--
View this message in context: http://lucene.472066.n3.nabble.com/Tika-Api-consumes-given-stream-tp4168960.html
Sent from the Apache Tika - Development mailing list archive at Nabble.com.
Re: Tika Api consumes given stream
Posted by Tyler Palsulich <tp...@gmail.com>.
Shot in the dark here, as I haven't tried this. But, have you tried using
mark/reset on the TikaInputStream? That should forward the requests on to
the underlying InputStream and hopefully work.
Tyler
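To make the mark/reset suggestion concrete, here is a minimal sketch that uses only java.io and no Tika classes. The class MarkResetPeek and its peek method are hypothetical stand-ins for a detector that reads a few leading bytes; they are not part of any Tika API. The sketch assumes the stream supports mark/reset (here by wrapping it in a BufferedInputStream) and shows that the same stream object is fully readable again after the peek. Whether this applies to the Tika case is an assumption: if detection reads through a wrapper such as the TikaInputStream, it may be the wrapper, not the original stream, that has to be read afterwards.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class MarkResetPeek {

    // Hypothetical stand-in for a detector: reads up to `limit` leading bytes,
    // then rewinds the stream to the position it had on entry.
    static byte[] peek(InputStream in, int limit) throws IOException {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        in.mark(limit); // remember the current position
        try {
            byte[] buf = new byte[limit];
            int total = 0;
            while (total < limit) {
                int n = in.read(buf, total, limit - total);
                if (n < 0) {
                    break; // end of stream reached before `limit` bytes
                }
                total += n;
            }
            return Arrays.copyOf(buf, total);
        } finally {
            in.reset(); // rewind so the caller sees an unconsumed stream
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "PK-rest-of-archive".getBytes(StandardCharsets.US_ASCII);
        // BufferedInputStream adds mark/reset support to streams that lack it.
        InputStream in = new BufferedInputStream(new ByteArrayInputStream(data));

        byte[] head = peek(in, 2);
        byte[] all = in.readAllBytes(); // requires Java 9 or later

        System.out.println(new String(head, StandardCharsets.US_ASCII));
        System.out.println(all.length == data.length);
    }
}
```

The important detail is that the caller keeps reading from the same object that was marked and reset. Reading from a different, underlying stream after the peek gives no such guarantee.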