Posted to users@tomcat.apache.org by Fred Toth <ft...@synernet.com> on 2008/09/14 19:24:18 UTC

Byte-serving PDFs unsupported? Or broken?

Hi all,

I've been trying to get to the bottom of an old question:

Does tomcat support byte-serving of PDF files? In searching the archives,
this comes up every few years or so and most responses are confusing
and inconclusive.

Here's more detail on the question, starting with my symptoms:

For unrelated reasons, I just switched a client's site from using apache to using tomcat 5.
I immediately started hearing "PDFs are slower to download". I was able to confirm
this. For example, a 14 MB PDF on my connection using tomcat takes about 45 seconds before
the first page appears in the browser. On a generic apache install, I see the first page
in approximately 1 second.

The reason for this is that tomcat, out of the box, does not appear to support "byte serving",
or, possibly, it doesn't support it in a form that's acceptable to Adobe Acrobat
browser plug-ins.

To understand this one needs a bit of info on PDF internals:

It is possible to create PDFs that are "optimized for web view". PDFs in this form are
rearranged internally such that the first page can be delivered more quickly. (Earlier PDFs
kept all pointer information at the end of the file, which meant the entire file had to
be downloaded before the reader could find the first page.)
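
As a rough way to tell whether a given file is in the "optimized for web view" (linearized) form, one can look for a /Linearized key near the start of the file, since the PDF specification places the linearization dictionary within the first 1024 bytes. The sketch below is only a heuristic under that assumption; the class and method names are hypothetical, and it does not validate the PDF in any other way.

```java
import java.nio.charset.StandardCharsets;

/** Heuristic sketch (hypothetical names): does a PDF look linearized? */
final class LinearizedPdfSketch {

    /**
     * Inspects the leading bytes of a file. Returns true when they start
     * with the PDF header and contain a /Linearized key within the first
     * 1024 bytes. This is a quick check, not a full parse.
     */
    static boolean looksLinearized(byte[] head) {
        int n = Math.min(head.length, 1024);
        // ISO-8859-1 maps every byte to one char, so binary data is safe here.
        String text = new String(head, 0, n, StandardCharsets.ISO_8859_1);
        return text.startsWith("%PDF-") && text.contains("/Linearized");
    }
}
```

The input would typically be the first kilobyte of the file read from disk. A file that fails this check will not benefit from byte serving no matter which server delivers it.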

The apache web server supports these "optimized" PDFs with no particular configuration.
However, tomcat does not.

What I can't seem to find out is whether this is just not supported. Or does it require some
specific tomcat configuration? Or does the Acrobat plug-in bend the specification
in a way that apache handles, but tomcat does not?

The underlying technology is based on particular HTTP headers. "Accept-Ranges" is used
by the server to say, "Yes, you can ask me for byte ranges of a file". The browser (or, in
this case, the Acrobat plug-in) responds with requests carrying a specific "Range" header,
which the server answers with a matching "Content-Range" header, essentially treating
the PDF file as a random-access file.
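
To make the header exchange concrete, here is a minimal sketch of how a server might resolve a single-range "Range" request value against a file length and build the "Content-Range" value for the reply. It is not Tomcat's DefaultServlet code; the class and method names are hypothetical, and multi-range requests are deliberately left out.

```java
/** Minimal sketch of single-range "bytes=" handling; names are hypothetical. */
final class ByteRangeSketch {

    /**
     * Resolves a Range header value such as "bytes=0-1023", "bytes=500-" or
     * "bytes=-200" against a file of the given length. Returns {first, last}
     * (both inclusive), or null if the header is absent, malformed,
     * multi-range, or cannot be satisfied.
     */
    static long[] resolve(String rangeHeader, long fileLength) {
        if (rangeHeader == null || !rangeHeader.startsWith("bytes=") || fileLength <= 0) {
            return null;
        }
        String spec = rangeHeader.substring("bytes=".length()).trim();
        int dash = spec.indexOf('-');
        if (dash < 0 || spec.indexOf(',') >= 0) {
            return null; // multi-range requests are out of scope for this sketch
        }
        String firstText = spec.substring(0, dash).trim();
        String lastText = spec.substring(dash + 1).trim();
        try {
            long first;
            long last;
            if (firstText.isEmpty()) {
                // Suffix form: the final N bytes of the file.
                if (lastText.isEmpty()) {
                    return null;
                }
                long suffix = Long.parseLong(lastText);
                if (suffix <= 0) {
                    return null;
                }
                first = Math.max(0, fileLength - suffix);
                last = fileLength - 1;
            } else {
                first = Long.parseLong(firstText);
                // An open-ended or oversized end is clamped to the last byte.
                last = lastText.isEmpty()
                        ? fileLength - 1
                        : Math.min(Long.parseLong(lastText), fileLength - 1);
            }
            if (first < 0 || first > last) {
                return null;
            }
            return new long[] {first, last};
        } catch (NumberFormatException e) {
            return null;
        }
    }

    /** Formats the Content-Range value sent with a 206 Partial Content reply. */
    static String contentRange(long[] range, long fileLength) {
        return "bytes " + range[0] + "-" + range[1] + "/" + fileLength;
    }
}
```

For a 14,000,000-byte file, a request with "Range: bytes=0-1023" would resolve to the first kilobyte and be answered with "Content-Range: bytes 0-1023/14000000".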

I'm amazed that this doesn't come up more often, considering the prevalence of PDFs
(and the prevalence of tomcat!)

Also, here are some common discussion comments that are NOT the answer I'm seeking:

1. Yes, tomcat serves PDFs out of the box quite nicely.
2. Yes, one can use apache to serve PDFs instead of tomcat. This is not an option
in my case because I'm using tomcat to implement access control to those PDFs.
3. Yes, I know that one could write code to solve this, but I'm hoping that DefaultServlet
can do this. It's not trivial to implement.
4. I've seen comments that indicate that there is general support in tomcat for Accept-Ranges
and Content-Range. But I've also seen indications that Acrobat might require some specific
flow of headers before it does the right thing. I can confirm that neither tomcat 5 nor tomcat 6
handles this properly (at least with a generic configuration).
5. Yes, 14 MB is quite large for a PDF and there are ways to make smaller PDFs. This particular
client has specific reasons for using such large PDFs.
6. This has nothing to do with generating PDFs on the fly. These are PDF files sitting
quietly in the web root.

Thanks for any advice you might have!

Fred Toth



---------------------------------------------------------------------
To start a new topic, e-mail: users@tomcat.apache.org
To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
For additional commands, e-mail: users-help@tomcat.apache.org


Re: Byte-serving PDFs unsupported? Or broken?

Posted by Christopher Schultz <ch...@christopherschultz.net>.

Fred,

Fred Toth wrote:
> I was in fact able to solve this with a simple filter, though this is not
> a general solution. In hopes that this helps someone else, the filter
> is included below.

I like what you've done here, but it's odd that the DefaultServlet does
not include this header by default, regardless of the file type (you
have selected only PDF in your filter, which is probably a safer
implementation).

>                if (uri != null && uri.endsWith("pdf")) {
>                        res.setHeader("Accept-Ranges", "bytes");
>                }

Why not simply enable this filter only for *pdf requests using
<filter-mapping>? It will probably run faster and make your code simpler.

-chris
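
For reference, the <filter-mapping> suggestion above might look like the following web.xml fragment. This is a sketch, not a tested configuration: the filter class is the one posted in this thread, while the filter name is an arbitrary label chosen here. Extension patterns are matched case-sensitively, so a file named REPORT.PDF would likely need its own mapping.

```xml
<!-- Register the filter posted in this thread; the filter-name is an arbitrary label. -->
<filter>
    <filter-name>AcceptRangesFilter</filter-name>
    <filter-class>com.toth.filter.AcceptRangesFilter</filter-class>
</filter>

<!-- Run it only for requests whose path ends in .pdf. -->
<filter-mapping>
    <filter-name>AcceptRangesFilter</filter-name>
    <url-pattern>*.pdf</url-pattern>
</filter-mapping>
```

With a mapping like this, the URI check inside doFilter becomes unnecessary, because the container only invokes the filter for matching requests.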



Re: Byte-serving PDFs unsupported? Or broken?

Posted by Fred Toth <ft...@synernet.com>.
Hi one more time,

I was in fact able to solve this with a simple filter, though this is not
a general solution. In hopes that this helps someone else, the filter
is included below.

Thanks,

Fred Toth

package com.toth.filter;

import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import java.io.IOException;
import javax.servlet.ServletException;
import javax.servlet.FilterConfig;

public class AcceptRangesFilter implements Filter {

        public void init(FilterConfig config) throws ServletException {

        }

        /*
         * This handles a long-standing tomcat bug (sort of). Tomcat supports
         * the underlying mechanism of byte serving of PDFs. However, the only
         * way to kick this mechanism into action is by telling the Acrobat
         * plugin that we accept byte range requests. The HTTP spec does not
         * require this, but Adobe does. This should really be fixed in
         * DefaultServlet but this gets the job done for now.
         *
         * See: https://issues.apache.org/bugzilla/show_bug.cgi?id=45419
         */
        public void doFilter(ServletRequest request, ServletResponse response,
                        FilterChain chain) throws IOException, ServletException {

                HttpServletRequest req = (HttpServletRequest) request;
                HttpServletResponse res = (HttpServletResponse) response;

                // Check for null before lower-casing so the guard below is meaningful.
                String uri = req.getRequestURI();
                if (uri != null) {
                        uri = uri.toLowerCase();
                }

                // Advertise byte-range support for PDF requests only.
                if (uri != null && uri.endsWith("pdf")) {
                        res.setHeader("Accept-Ranges", "bytes");
                }

                chain.doFilter(request, response);

        }

        public void destroy() {
        }

}


Re: Byte-serving PDFs unsupported? Or broken?

Posted by Fred Toth <ft...@synernet.com>.
Hello again,

After a bit more looking, I found this discussion in bugzilla from this July:

https://issues.apache.org/bugzilla/show_bug.cgi?id=45419

In short, according to the discussion, correct interaction with the
Adobe plugin requires that a server send the "Accept-Ranges" header
in response to the initial request. The Adobe plugin sees this and then
proceeds to read the PDF file in chunks instead of requesting the
whole thing.

Currently, the tomcat DefaultServlet does not set this header, even though
it supports the range requests themselves (answering them with Content-Range).
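
One way to confirm this from the client side is to send a HEAD request and inspect the Accept-Ranges header of the reply. The sketch below uses the JDK's HttpURLConnection; the class and method names are hypothetical and the URL is whatever PDF you want to test. It only reports whether the server advertises byte ranges, not whether the Acrobat plug-in will actually use them.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.util.Locale;

/** Sketch (hypothetical names): does a server advertise byte-range support? */
final class AcceptRangesCheck {

    /** Returns true if an Accept-Ranges header value lists the "bytes" unit. */
    static boolean advertisesBytes(String acceptRanges) {
        if (acceptRanges == null) {
            return false; // header absent, as described for DefaultServlet above
        }
        for (String unit : acceptRanges.split(",")) {
            if (unit.trim().toLowerCase(Locale.ROOT).equals("bytes")) {
                return true;
            }
        }
        return false;
    }

    /** Sends a HEAD request to the given URL and checks the reply header. */
    static boolean probe(String url) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setRequestMethod("HEAD");
        try {
            conn.getResponseCode(); // forces the request to be sent
            return advertisesBytes(conn.getHeaderField("Accept-Ranges"));
        } finally {
            conn.disconnect();
        }
    }
}
```

Running probe against the same PDF on the apache and tomcat installs should show the difference described in the bug report, assuming both are reachable from the test machine.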

It appears that I can work around this with a filter. I'll report back on that
after I've got one going.

Can anyone comment on the status of this bug?

Many thanks,

Fred Toth



RE: Byte-serving PDFs unsupported? Or broken?

Posted by Martin Gainty <mg...@hotmail.com>.
The servlet alternative is to use FOP (Formatting Objects Processor) to produce Acrobat 7.0 documents.

The best bet is to download the source:
$FOP_HOME\src>svn co https://svn.apache.org/repos/asf/xmlgraphics/fop/branches/fop-0_95 src
Build the source and create the war:
>ant
Use the TC manager to deploy the war:
$FOP_HOME\src\build\fop.war
Then run FOP in the browser using the input readme.fo (an FO XML document):
http://localhost:8080/fop/servlet/FopServlet?fo=readme.fo&pdf=readme.pdf&ext=.pdf
If you have properly downloaded and installed the Adobe Acrobat 7 PDF viewer plugin, you should be able to view the PDF.

The longer-range scenario should look at viewing and manipulating MTOM-optimised binaries with Axis:
http://ws.apache.org/axis2/1_0/mtom-guide.html

FWIW
Martin


