You are viewing a plain text version of this content. The canonical link for it is here.
Posted to mod_python-dev@quetz.apache.org by "Nicolas Lehuen (JIRA)" <ji...@apache.org> on 2005/12/10 10:18:08 UTC
[jira] Resolved: (MODPYTHON-97) mod_python.publisher iterables and content_type broken

     [ http://issues.apache.org/jira/browse/MODPYTHON-97?page=all ]
     
Nicolas Lehuen resolved MODPYTHON-97:
-------------------------------------

    Fix Version: 3.2
     Resolution: Fixed
      Assign To: Nicolas Lehuen

Fixed this by reverting the changes from MODPYTHON-15.

> mod_python.publisher iterables and content_type broken
> ------------------------------------------------------
>
>          Key: MODPYTHON-97
>          URL: http://issues.apache.org/jira/browse/MODPYTHON-97
>      Project: mod_python
>         Type: Bug
>   Components: publisher
>     Versions: 3.2
>     Reporter: Graham Dumpleton
>     Assignee: Nicolas Lehuen
>      Fix For: 3.2

>
> In 3.2, mod_python.publisher was modified so that if it encountered an interable it would recursively iterate over the items and publish each with the result being concatenated.
> FWIW, I personally didn't like this as I saw it potentially changing the behaviour of existing code, although perhaps in contrived cases or for test code only. I saw that this sort of behaviour should have been managed by the user by explicit use of a wrapper class instead, rather than it being magic. End of ramble. :-)
> Regardless of my concerns, the behaviour that was added is broken. Specifically, mod_python.publisher is setting the content type based on the content of the first item returned from the iterable. For example, consider the following:
> index = [
>   '<html><body><p>',
>   1000 * "X",
>   '</p></body></html>',
> ]
> When published, this is resulting in the content type being set to 'text/plain' and not 'text/html'. In part this is perhaps caused by the fact that the content type check is now performed by looking for a trailing '</html>' in the content whereas previously it would look for a leading '<html>'. This was changed because all the HTML prologue that can appear before '<html>' would often throw out this check with the HTML not being automatically being detected. Thus at the time it was thought that looking for the trailing '</html>' would be more reliable. It ain't going to help to go back to using a leading '<html>' check though as the first item may only contain the prologue and not '<html>'.
> These checks are only going to work for iterables if the results of publishing of each item were added to the end of a list of strings, rather than being written back immediately using req.write(). Once all that has been returned by the iterable is obtained, this can all be joined back together and then the HTML check done.
> Joining all the separate items returned from the iterable back together defeats the purpose of what this feature was about in the first place and may result in huge in memory objects needing to be created to hold the combined result just so the HTML check can be done.
> The only way to avoid the problem is for the content type to be set explicitly by the user before the iterable is processed. This is a bit tricky as it is mod_python.publisher which is automagically doing this. The best you can do is something like:
> class SetContentType:
>   def __init__(self,content_type):
>     self.__content_type = content_type
>   def __call__(self,req):
>     req.content_type = self.__content_type
>     return ""
> index = [
>   SetContentType('text/html'),
>   '<html><body><p>',
>   1000 * "X",
>   '</p></body></html>',
> ]
> Once you start doing this, the user may as well have provided their own published function in the first place that set the content type and manually iterated over items and wrote them to req.write(). This could also be managed by a user specified wrapper class which is how I saw this as preferably being done in the first place. Ie.,
> class PublishIterable:
>   def __init__(self,value,content_type):
>     self.__value = value
>     self.__content_type = content_type
>   def __call__(self,req):
>     req.content_type = self.__content_type
>     for item in self.__value:
>       req.write(item)
> _values = [
>   '<html><body><p>',
>   1000 * "X",
>   '</p></body></html>',
> ]
> index = PublishIterable(_values,'text/html')
> Personally I believe this automagic publishing of iterables should be removed from mod_python.publisher. You might still provide a special function/wrapper that works like PublisherIterable but handles recursive structures and callable objects in the process, but I feel it should be up to the user to make a conscious effort to use it and mod_python.publisher shouldn't assume that it should process any iterable in this way automatically.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira