Posted to user@poi.apache.org by Thiyagarajan <th...@gmail.com> on 2017/03/22 17:50:44 UTC

Apache POI - Detecting difference between an xlsx file and a normal zip file

Hi,
I have an InputStream wrapped in a BufferedInputStream, and I'm trying to
detect whether it is a normal zip file or an xlsx file (and act
accordingly). I tried using hasOOXMLHeader
<https://github.com/apache/poi/blob/trunk/src/java/org/apache/poi/poifs/filesystem/DocumentFactoryHelper.java#L91>
for this, but it only checks whether the input stream is a zip file; nothing
in it is specific to xlsx. (I understand that an xlsx file is a zip file
containing a bunch of XML files.)

Is it possible to detect whether the InputStream comes from an xlsx file or
a normal zip file?

Regards,
B.Thiyagarajan



--
View this message in context: http://apache-poi.1045710.n5.nabble.com/Apache-POI-Detecting-difference-between-an-xlsx-file-and-a-normal-zip-file-tp5727035.html
Sent from the POI - User mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


Re: Apache POI - Detecting difference between an xlsx file and a normal zip file

Posted by Javen O'Neal <ja...@gmail.com>.
You could also use Apache Tika to detect the file type.

On Mar 22, 2017 11:15, "Thiyagarajan" <th...@gmail.com> wrote:

> Hi,
> Say I have the actual File object of the xlsx file. What exactly do you
> mean by random-access?

Re: Apache POI - Detecting difference between an xlsx file and a normal zip file

Posted by Thiyagarajan <th...@gmail.com>.
Hi,
Say I have the actual File object of the xlsx file. What exactly do you mean
by random-access?





Re: Apache POI - Detecting difference between an xlsx file and a normal zip file

Posted by Nick Burch <ap...@gagravarr.org>.
On Wed, 22 Mar 2017, Thiyagarajan wrote:
> I have an InputStream wrapped in a BufferedInputStream, and I'm trying to
> detect whether it is a normal zip file or an xlsx file (and act
> accordingly). I tried using hasOOXMLHeader
> <https://github.com/apache/poi/blob/trunk/src/java/org/apache/poi/poifs/filesystem/DocumentFactoryHelper.java#L91>
> for this, but it only checks whether the input stream is a zip file; nothing
> in it is specific to xlsx. (I understand that an xlsx file is a zip file
> containing a bunch of XML files.)
>
> Is it possible to detect whether the InputStream comes from an xlsx file or
> a normal zip file?

Not easily from an InputStream. To have a good idea, you'd need to check
whether the zip contains a [Content_Types].xml entry, and that may not be
the first entry in the zip. So you'll need to buffer the zip stream into
memory, scan through the entries looking for the content types entry, and
then rewind to process the stream if you find it. It's much easier with a
File, as you can use random access to check without buffering.
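
As a rough illustration of both approaches, here is a minimal Java sketch
that uses only java.util.zip. The class name OoxmlSniffer, the method name
hasContentTypes and the maxBytes parameter are made up for this example and
are not part of POI. The stream variant assumes the caller is happy to let
up to maxBytes be buffered so that reset() can rewind; if the scan reads
past that limit, reset() may fail with an IOException.

```java
import java.io.BufferedInputStream;
import java.io.File;
import java.io.IOException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipInputStream;

// Hypothetical helper, not a POI class.
final class OoxmlSniffer {
    private static final String CONTENT_TYPES = "[Content_Types].xml";

    // File case: ZipFile looks the entry up by name via random access,
    // so nothing has to be buffered in memory.
    static boolean hasContentTypes(File file) throws IOException {
        try (ZipFile zip = new ZipFile(file)) {
            return zip.getEntry(CONTENT_TYPES) != null;
        }
    }

    // Stream case: walk the entries in order, then rewind to the start.
    // maxBytes is an assumed upper bound on how much may be read before reset().
    static boolean hasContentTypes(BufferedInputStream in, int maxBytes)
            throws IOException {
        in.mark(maxBytes);
        try {
            // Deliberately not closed: closing it would close the caller's stream.
            ZipInputStream zip = new ZipInputStream(in);
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (CONTENT_TYPES.equals(entry.getName())) {
                    return true;
                }
            }
            return false;
        } finally {
            // Rewind so the caller can process the stream from the beginning.
            in.reset();
        }
    }
}
```

Note that [Content_Types].xml is present in any OOXML package (docx and pptx
as well as xlsx), so a positive result here only says "probably OOXML", not
"definitely xlsx".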

Nick
