Posted to dev@trafficserver.apache.org by Chris Reynolds <sh...@gmail.com> on 2011/09/09 18:59:01 UTC

Problems with using transformation API with caching enabled

Hi,

I have written a plugin that scans data using transformations. It all works
fine until caching is enabled. I have found two problems and I have supplied
modified bnull_transform code to demonstrate them. The only changes I have
made to the code are: basic logging, adding hooks for READ_CACHE_HDR_HOOK
and reading the cache headers instead of the server headers
where appropriate.

The problems are:

1. When cached data passes through the transformation, a duplicate copy of
it appears to be written to the cache. To reproduce this I use an Apache
server with the expires module enabled and then:
a. stop Traffic Server
b. clear the cache - traffic_server -Cclear
c. restart Traffic Server - running the modified bnull plugin
d. wget a 1MB file through the proxy - bnull logging shows that this comes
from the server
e. wait a while then use 'traffic_line -r proxy.process.cache.bytes_used' to
see that 1081344 bytes have been stored in the cache
f. perform exactly the same wget again - bnull logging shows that this comes
from the cache
g. use 'traffic_line -r proxy.process.cache.bytes_used' to see that 2162688
bytes have been stored in the cache - exactly double
h. repeat the wget - each request adds another 1081344 bytes to the cache

I have got around this in my own plugin by not writing to the OutputVConn
connection when the headers have come from the cache. I do not know whether
this is the best solution though.
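
The workaround described above can be sketched roughly as follows. This is a
self-contained model, not code against the Traffic Server API: HeaderSource,
TransformState, should_write_downstream and handle_chunk are hypothetical
names introduced for illustration, and byte counters stand in for the real
reads and writes. It assumes the plugin records, per transaction, whether
the headers were seen on the cache path or the server path.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical: where the response headers for this transaction came from. */
typedef enum { HDR_FROM_SERVER, HDR_FROM_CACHE } HeaderSource;

/* Hypothetical per-transaction state kept by the transformation. */
typedef struct {
    HeaderSource source;   /* set when the headers are first seen */
    size_t bytes_scanned;  /* data inspected by the scanner */
    size_t bytes_forwarded; /* data written to the output connection */
} TransformState;

/* Assumption taken from the workaround: skip the downstream write when the
 * headers came from the cache. */
bool should_write_downstream(const TransformState *st)
{
    return st->source != HDR_FROM_CACHE;
}

/* Every chunk is scanned; it is only forwarded when the check allows it. */
void handle_chunk(TransformState *st, size_t len)
{
    st->bytes_scanned += len;
    if (should_write_downstream(st)) {
        st->bytes_forwarded += len;
    }
}
```

In this model a transaction marked HDR_FROM_CACHE is still scanned in full
but forwards nothing, which mirrors the workaround; whether that is the right
fix for the duplicate cache write is the open question.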

2. This problem is more complicated. Sometimes, when a client sends a
request with an 'if-modified-since' header to Traffic Server with caching
enabled and the response does not come from the cache, the transformation is
created but never called, so I cannot scan the data. The modified bnull
plugin shows this - in this case the cache header code path is never used
because the headers always come from the server. I have attached network
traces that show the problem. To test this I use a Eurogamer URL whose
response has been cached in the browser, which causes the browser to add an
'if-modified-since' header to subsequent requests. The URL I use is:
http://www.eurogamer.net/styles/Autocomplete.css?v10.20-21670.

Normally this occurs (the 200sentback.cap shows this):
a. Go to the URL so that the response is stored in the browser cache.
b. Restart the browser.
c. Enter the URL again - browser sends request with 'if-modified-since'
header - this is shown in the capture.
d. The capture shows the request being sent to the server with
the 'if-modified-since' header stripped.
e. The server returns 200.
f. The proxy returns 200 and the transformation fires as normal - the
logging shows this:
20110909.17h38m12s transform_plugin - from server
20110909.17h38m12s transformable 1
20110909.17h38m12s bnull_transform 1
20110909.17h38m12s handle_transform
20110909.17h38m12s bnull_transform 1
20110909.17h38m12s handle_transform
20110909.17h38m12s bnull_transform 103
20110909.17h38m12s bnull_transform 1

When the problem occurs (the 304sentback.cap shows this):
a. Go to the URL so that the response is stored in the browser cache.
b. Restart the browser.
c. Enter the URL again - browser sends request with 'if-modified-since'
header - this is shown in the capture.
d. The capture shows the request being sent to the server with
the 'if-modified-since' header stripped.
e. The server returns 200.
f. The proxy returns 304 and closes the connection to the server. When this
occurs the transaction is closed down and the transformation is never
called. The logging shows this:
20110909.17h39m28s transform_plugin - from server
20110909.17h39m28s transformable 1
20110909.17h39m28s bnull_transform 1
As the logging shows, 'handle_transform' is never called.

Note that it can take a number of attempts to re-create the problem. I used
Firefox and opened a new tab for each attempt.

It's odd that for 90% of the requests Traffic Server returns a 200 response
but for the rest (which are identical) it returns a 304. I guess that the
304 is causing the transformation not to fire because it thinks there is no
data. I cannot think of a workaround to get past this issue.
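
To make that guess concrete, here is a minimal model of how a proxy could
answer a conditional request itself after fetching a full 200 from the
server. It is not Traffic Server's actual logic, and it does not explain why
only some requests take this path. ClientRequest, OriginResponse,
pick_client_status and client_response_has_body are hypothetical names, and
timestamps are assumed to be already parsed into time_t values.

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical view of the client's conditional request. */
typedef struct {
    bool   has_if_modified_since;
    time_t if_modified_since;
} ClientRequest;

/* Hypothetical view of the response fetched from the server. */
typedef struct {
    int    status;
    bool   has_last_modified;
    time_t last_modified;
} OriginResponse;

/* Standard If-Modified-Since semantics: a 200 whose Last-Modified is not
 * later than the client's date can be answered with 304 instead. */
int pick_client_status(const ClientRequest *req, const OriginResponse *rsp)
{
    if (rsp->status == 200 &&
        req->has_if_modified_since &&
        rsp->has_last_modified &&
        rsp->last_modified <= req->if_modified_since) {
        return 304;
    }
    return rsp->status;
}

/* A 304 carries no message body, so in this model there is no body data
 * for a response transformation to consume. */
bool client_response_has_body(int status)
{
    return status != 304;
}
```

Under these assumptions, the same 200 from the server yields a bodiless 304
to the client whenever the client's 'if-modified-since' date is not older
than the object's Last-Modified date, which would match the case where
'handle_transform' never appears in the log.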

Can anyone shed any light on this?

Cheers,

Chris Reynolds.