You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@httpd.apache.org by Yann Ylavic <yl...@gmail.com> on 2016/09/09 23:13:20 UTC

Re: svn commit: r1757982 - /httpd/httpd/branches/2.4.x/STATUS

On Sat, Aug 27, 2016 at 12:12 PM,  <el...@apache.org> wrote:
> Author: elukey
> Date: Sat Aug 27 10:12:23 2016
> New Revision: 1757982
>
> URL: http://svn.apache.org/viewvc?rev=1757982&view=rev
> Log:
> Updated backport proposal with the latest discussion on dev@
> about Last-Modified header handling.
> I removed jchampion's vote since it was related to a completely
> different solution.
>
>
> Modified:
>     httpd/httpd/branches/2.4.x/STATUS
>
[]
>
>    *) core: Drop an invalid Last-Modified header value coming
> -     from a FCGI/CGI script instead of replacing it with Unix epoch.
> -     Warn the users about Last-Modified header value replacements and
> -     improved handling of non-GMT datestr.
> +     from a (F)CGI script instead of replacing it with Unix epoch.
> +     Warn the users about Last-Modified header value replacements
> +     and violations of the RFC.
>       trunk patch: http://svn.apache.org/r1748379
>                    http://svn.apache.org/r1750747
>                    http://svn.apache.org/r1750749
> @@ -139,11 +139,13 @@ PATCHES PROPOSED TO BACKPORT FROM TRUNK:
>                    http://svn.apache.org/r1751138
>                    http://svn.apache.org/r1751139
>                    http://svn.apache.org/r1751147
> -     2.4.x: trunk patches works (final view http://apaste.info/9v3)
> -     The last revision has been discussed in dev@ and submitted by Yann.
> +                  http://svn.apache.org/r1757818
> +     2.4.x: trunk patches works (final view http://apaste.info/FCs)
> +     The last revision has been discussed extensively in dev@ and this seems to be
> +     a good compromise for the moment.
>       Tested the code with a simple PHP script returning different Last-Modified
> -     headers (GMT now, GMT now Europe/Paris, GMT tomorrow, GMT yesterday).
> -     +1: elukey, jchampion
> +     headers (GMT now, GMT now Europe/Paris, GMT tomorrow, GMT yesterday, PST now).
> +     +1: elukey

As read it, this proposal (http://apaste.info/FCs) does not enforce
GMT date and will treat any other timezone as if it were GMT (per
apr_date_parse_http(), which does not look for "GMT" anywhere), right?

Shouldn't we either ignore (i.e. not forward) non-GMT, or change it to
GMT if it's a valid timezone?

The only way to do this would be to use apr_date_parse_rfc() in the first place.
If we'd want to ignore non-GMT, compare it to apr_date_parse_http().
And only otherwise or if it matches, let it go through ap_update_mtime().

I don't the difference between this proposal and the current code (but
TRACEs), ap_update_mtime() is a noop with APR_DATE_BAD.
Did I miss something?

Regards,
Yann.

Re: svn commit: r1757982 - /httpd/httpd/branches/2.4.x/STATUS

Posted by Luca Toscano <to...@gmail.com>.
Hi Yann,

2016-09-10 13:58 GMT+02:00 Yann Ylavic <yl...@gmail.com>:

> Hi Luca,
>
> On Sat, Sep 10, 2016 at 12:07 PM, Luca Toscano <to...@gmail.com>
> wrote:
> >
> > 2016-09-10 1:13 GMT+02:00 Yann Ylavic <yl...@gmail.com>:
> >>
> >> As read it, this proposal (http://apaste.info/FCs) does not enforce
> >> GMT date and will treat any other timezone as if it were GMT (per
> >> apr_date_parse_http(), which does not look for "GMT" anywhere), right?
> >
> > Yes, it leaves the current behavior unaltered (except for APR_DATE_BAD).
>
> So we still may change the Last-Modified sent by a CGI...
>

Yes, all the drawbacks of the current approach (basically assuming 'GMT' in
the datestring returned by a CGI) are still there, but they are logged and
users know what httpd is doing.


> >>
> >> Shouldn't we either ignore (i.e. not forward) non-GMT, or change it to
> >> GMT if it's a valid timezone?
> >>
> >> The only way to do this would be to use apr_date_parse_rfc() in the
> first
> >> place.
> >> If we'd want to ignore non-GMT, compare it to apr_date_parse_http().
> >> And only otherwise or if it matches, let it go through
> ap_update_mtime().
> >
> > My decision to not use apr_date_parse_rfc came after the long discussion
> in
> > dev@ about whether or not a Last-Modified header coming from a CGI
> backend
> > should be considered HTTP input (so assuming GMT IIUC) or not. Since we
> > didn't reach a complete agreement on how to proceed I thought to propose
> a
> > light change as starter (to answer a question raised in users@ a while
> back
> > about warning admins on httpd modifications of the Last-Modified header
> > value) and think about more invasive changes in the future, like
> dropping a
> > non GMT header or converting it in GMT if on a valid timezone.
>
> If we really didn't want to modify the (semantic of the )header, we'd
> accept other timezones and convert it to GMT.
> But dropping or interpreting it as GMT when it's not (i.e. s/<any
> timezone>/GMT/ without changing the date) is a modification anyway.
>
> So if we want to stick to the RFC (and it's very strict
> interpretation, which I'm not sure it's that strict), either we ignore
> or we error on non-GMT (the latter would be an unacceptable change in
> behaviour IMHO).
>

I wouldn't like both since they could be too invasive for existing
web-apps/CGIs that are relying on this "feature" without knowing it. I am
in favor of either interpreting a valid non-GMT datestring or leaving the
current behavior and log a warning.


> >> I don't the difference between this proposal and the current code (but
> >> TRACEs), ap_update_mtime() is a noop with APR_DATE_BAD.
> >> Did I miss something?
> >
> >
> > This is a very good point, I didn't notice an important part. From what I
> > can see ap_update_mtime is indeed not updating anything, but r->mtime is
> > already 0 and the ap_set_last_modified picks it up and use it (ending up
> > with a Unix epoch datetime string). My check on APR_DATE_BAD avoids the
> call
> > to ap_set_last_modified, this is why the header is dropped.
>
> Yes, I missed ap_set_last_modified() was using ~now when given zero.
>
> > Maybe we also
> > need something like the following?
> >
> >  AP_DECLARE(void) ap_set_last_modified(request_rec *r)
> >  {
> > -    if (!r->assbackwards) {
> > +    if (!r->assbackwards && r->mtime > 0) {
> >          apr_time_t mod_time = ap_rationalize_mtime(r, r->mtime);
> >          char *datestr = apr_palloc(r->pool, APR_RFC822_DATE_LEN);
>
> That'd change its behaviour, one may currently use it to set
> Last-Modified to now, w/o calling ap_update_mtime(), and it used to
> work...
>

My bad, didn't investigate the solution properly :)


> Anyway, my personal preference would be to tolerate a different
> timezone (than GMT), or at at worse ignore the header (if not GMT).
>
> So I think the patch (against 2.4.x) would be something like :
>
> Index: server/util_script.c
> ===================================================================
> --- server/util_script.c    (revision 1760114)
> +++ server/util_script.c    (working copy)
> @@ -670,11 +670,26 @@ AP_DECLARE(int) ap_scan_script_header_err_core_ex(
>          }
>          /*
>           * If the script gave us a Last-Modified header, we can't just
> -         * pass it on blindly because of restrictions on future values.
> +         * pass it on blindly (RFC mandates GMT).  Future values are
> +         * handled by ap_set_last_modified() and changed to now.
>           */
>          else if (!strcasecmp(w, "Last-Modified")) {
> -            ap_update_mtime(r, apr_date_parse_http(l));
> -            ap_set_last_modified(r);
> +            apr_time_t parsed_date = apr_date_parse_rfc(l);
> +            if (parsed_date == APR_DATE_BAD) {
> +               ap_log_rerror(SCRIPT_LOG_MARK, APLOG_DEBUG, 0, r,
> +                             "Ignored Last-Modified header value: '%s'",
> +                             l);
> +            }
> +            else if(parsed_date != apr_date_parse_http(l)) {
> +                ap_log_rerror(SCRIPT_LOG_MARK, APLOG_DEBUG, 0, r,
> +                              "Ignored Last-Modified header value (not "
> +                              "within the GMT timezone, as required):
> '%s'",
> +                              l);
> +            }
>

The patch looks good but the above bit is what I'd be worried about, since
I am afraid that it could affect existing web-app/CGIs needing extra
tuning/config after the httpd upgrade.

Imagine an old and untouchable CGI returning an 'incorrect' Last-Modified
header, worked fine till now but broken after this change. Maybe I am too
paranoid but the admins in the community have already a painful life while
upgrading systems, I wouldn't cause more headaches to them if possible :)


> +            else {
> +                ap_update_mtime(r, parsed_date);
> +                ap_set_last_modified(r);
> +            }
>          }
>          else if (!strcasecmp(w, "Set-Cookie")) {
>              apr_table_add(cookie_table, w, l);
> _
>
>
> Or the lenient mode :
>
> Index: server/util_script.c
> ===================================================================
> --- server/util_script.c    (revision 1760114)
> +++ server/util_script.c    (working copy)
> @@ -670,11 +670,21 @@ AP_DECLARE(int) ap_scan_script_header_err_core_ex(
>          }
>          /*
>           * If the script gave us a Last-Modified header, we can't just
> -         * pass it on blindly because of restrictions on future values.
> +         * pass it on blindly (RFC mandates GMT, but be lenient and
> +         * transform other timezones to GMT).  Future values are handled
> +         * by ap_set_last_modified() and changed to now.
>           */
>          else if (!strcasecmp(w, "Last-Modified")) {
> -            ap_update_mtime(r, apr_date_parse_http(l));
> -            ap_set_last_modified(r);
> +            apr_time_t parsed_date = apr_date_parse_rfc(l);
> +            if (parsed_date == APR_DATE_BAD) {
> +               ap_log_rerror(SCRIPT_LOG_MARK, APLOG_DEBUG, 0, r,
> +                             "Ignored Last-Modified header value: '%s'",
> +                             l);
> +            }
> +            else {
> +                ap_update_mtime(r, parsed_date);
> +                ap_set_last_modified(r);
> +            }
>          }
>          else if (!strcasecmp(w, "Set-Cookie")) {
>              apr_table_add(cookie_table, w, l);
>

I really like this one, but I'd also add a trace log to warn the admin
if apr_date_parse_http(l) != apr_date_parse_rfc(l).

My last code change proposal was an attempt to satisfy a user request and a
dev@ discussion, but probably I failed to find a good compromise.

I am personally in favor of using apr_date_parse_rfc instead of the _http
counterpart, but I'd also like to see an agreement in this list. My +1 goes
to your second solution (plus some trace logging if everybody likes it).

William/Jacob: if you want to add more thoughts please do :)

Thanks a lot for the review!

Luca

Re: svn commit: r1757982 - /httpd/httpd/branches/2.4.x/STATUS

Posted by Yann Ylavic <yl...@gmail.com>.
Hi Luca,

On Sat, Sep 10, 2016 at 12:07 PM, Luca Toscano <to...@gmail.com> wrote:
>
> 2016-09-10 1:13 GMT+02:00 Yann Ylavic <yl...@gmail.com>:
>>
>> As read it, this proposal (http://apaste.info/FCs) does not enforce
>> GMT date and will treat any other timezone as if it were GMT (per
>> apr_date_parse_http(), which does not look for "GMT" anywhere), right?
>
> Yes, it leaves the current behavior unaltered (except for APR_DATE_BAD).

So we still may change the Last-Modified sent by a CGI...

>>
>> Shouldn't we either ignore (i.e. not forward) non-GMT, or change it to
>> GMT if it's a valid timezone?
>>
>> The only way to do this would be to use apr_date_parse_rfc() in the first
>> place.
>> If we'd want to ignore non-GMT, compare it to apr_date_parse_http().
>> And only otherwise or if it matches, let it go through ap_update_mtime().
>
> My decision to not use apr_date_parse_rfc came after the long discussion in
> dev@ about whether or not a Last-Modified header coming from a CGI backend
> should be considered HTTP input (so assuming GMT IIUC) or not. Since we
> didn't reach a complete agreement on how to proceed I thought to propose a
> light change as starter (to answer a question raised in users@ a while back
> about warning admins on httpd modifications of the Last-Modified header
> value) and think about more invasive changes in the future, like dropping a
> non GMT header or converting it in GMT if on a valid timezone.

If we really didn't want to modify the (semantic of the )header, we'd
accept other timezones and convert it to GMT.
But dropping or interpreting it as GMT when it's not (i.e. s/<any
timezone>/GMT/ without changing the date) is a modification anyway.

So if we want to stick to the RFC (and it's very strict
interpretation, which I'm not sure it's that strict), either we ignore
or we error on non-GMT (the latter would be an unacceptable change in
behaviour IMHO).

>
>>
>>
>> I don't the difference between this proposal and the current code (but
>> TRACEs), ap_update_mtime() is a noop with APR_DATE_BAD.
>> Did I miss something?
>
>
> This is a very good point, I didn't notice an important part. From what I
> can see ap_update_mtime is indeed not updating anything, but r->mtime is
> already 0 and the ap_set_last_modified picks it up and use it (ending up
> with a Unix epoch datetime string). My check on APR_DATE_BAD avoids the call
> to ap_set_last_modified, this is why the header is dropped.

Yes, I missed ap_set_last_modified() was using ~now when given zero.

> Maybe we also
> need something like the following?
>
>  AP_DECLARE(void) ap_set_last_modified(request_rec *r)
>  {
> -    if (!r->assbackwards) {
> +    if (!r->assbackwards && r->mtime > 0) {
>          apr_time_t mod_time = ap_rationalize_mtime(r, r->mtime);
>          char *datestr = apr_palloc(r->pool, APR_RFC822_DATE_LEN);

That'd change its behaviour, one may currently use it to set
Last-Modified to now, w/o calling ap_update_mtime(), and it used to
work...

Anyway, my personal preference would be to tolerate a different
timezone (than GMT), or at at worse ignore the header (if not GMT).

So I think the patch (against 2.4.x) would be something like :

Index: server/util_script.c
===================================================================
--- server/util_script.c    (revision 1760114)
+++ server/util_script.c    (working copy)
@@ -670,11 +670,26 @@ AP_DECLARE(int) ap_scan_script_header_err_core_ex(
         }
         /*
          * If the script gave us a Last-Modified header, we can't just
-         * pass it on blindly because of restrictions on future values.
+         * pass it on blindly (RFC mandates GMT).  Future values are
+         * handled by ap_set_last_modified() and changed to now.
          */
         else if (!strcasecmp(w, "Last-Modified")) {
-            ap_update_mtime(r, apr_date_parse_http(l));
-            ap_set_last_modified(r);
+            apr_time_t parsed_date = apr_date_parse_rfc(l);
+            if (parsed_date == APR_DATE_BAD) {
+               ap_log_rerror(SCRIPT_LOG_MARK, APLOG_DEBUG, 0, r,
+                             "Ignored Last-Modified header value: '%s'",
+                             l);
+            }
+            else if(parsed_date != apr_date_parse_http(l)) {
+                ap_log_rerror(SCRIPT_LOG_MARK, APLOG_DEBUG, 0, r,
+                              "Ignored Last-Modified header value (not "
+                              "within the GMT timezone, as required): '%s'",
+                              l);
+            }
+            else {
+                ap_update_mtime(r, parsed_date);
+                ap_set_last_modified(r);
+            }
         }
         else if (!strcasecmp(w, "Set-Cookie")) {
             apr_table_add(cookie_table, w, l);
_


Or the lenient mode :

Index: server/util_script.c
===================================================================
--- server/util_script.c    (revision 1760114)
+++ server/util_script.c    (working copy)
@@ -670,11 +670,21 @@ AP_DECLARE(int) ap_scan_script_header_err_core_ex(
         }
         /*
          * If the script gave us a Last-Modified header, we can't just
-         * pass it on blindly because of restrictions on future values.
+         * pass it on blindly (RFC mandates GMT, but be lenient and
+         * transform other timezones to GMT).  Future values are handled
+         * by ap_set_last_modified() and changed to now.
          */
         else if (!strcasecmp(w, "Last-Modified")) {
-            ap_update_mtime(r, apr_date_parse_http(l));
-            ap_set_last_modified(r);
+            apr_time_t parsed_date = apr_date_parse_rfc(l);
+            if (parsed_date == APR_DATE_BAD) {
+               ap_log_rerror(SCRIPT_LOG_MARK, APLOG_DEBUG, 0, r,
+                             "Ignored Last-Modified header value: '%s'",
+                             l);
+            }
+            else {
+                ap_update_mtime(r, parsed_date);
+                ap_set_last_modified(r);
+            }
         }
         else if (!strcasecmp(w, "Set-Cookie")) {
             apr_table_add(cookie_table, w, l);
_


Regards,
Yann.

Re: svn commit: r1757982 - /httpd/httpd/branches/2.4.x/STATUS

Posted by Luca Toscano <to...@gmail.com>.
Hi Yann,

2016-09-10 1:13 GMT+02:00 Yann Ylavic <yl...@gmail.com>:

> On Sat, Aug 27, 2016 at 12:12 PM,  <el...@apache.org> wrote:
> > Author: elukey
> > Date: Sat Aug 27 10:12:23 2016
> > New Revision: 1757982
> >
> > URL: http://svn.apache.org/viewvc?rev=1757982&view=rev
> > Log:
> > Updated backport proposal with the latest discussion on dev@
> > about Last-Modified header handling.
> > I removed jchampion's vote since it was related to a completely
> > different solution.
> >
> >
> > Modified:
> >     httpd/httpd/branches/2.4.x/STATUS
> >
> []
> >
> >    *) core: Drop an invalid Last-Modified header value coming
> > -     from a FCGI/CGI script instead of replacing it with Unix epoch.
> > -     Warn the users about Last-Modified header value replacements and
> > -     improved handling of non-GMT datestr.
> > +     from a (F)CGI script instead of replacing it with Unix epoch.
> > +     Warn the users about Last-Modified header value replacements
> > +     and violations of the RFC.
> >       trunk patch: http://svn.apache.org/r1748379
> >                    http://svn.apache.org/r1750747
> >                    http://svn.apache.org/r1750749
> > @@ -139,11 +139,13 @@ PATCHES PROPOSED TO BACKPORT FROM TRUNK:
> >                    http://svn.apache.org/r1751138
> >                    http://svn.apache.org/r1751139
> >                    http://svn.apache.org/r1751147
> > -     2.4.x: trunk patches works (final view http://apaste.info/9v3)
> > -     The last revision has been discussed in dev@ and submitted by
> Yann.
> > +                  http://svn.apache.org/r1757818
> > +     2.4.x: trunk patches works (final view http://apaste.info/FCs)
> > +     The last revision has been discussed extensively in dev@ and this
> seems to be
> > +     a good compromise for the moment.
> >       Tested the code with a simple PHP script returning different
> Last-Modified
> > -     headers (GMT now, GMT now Europe/Paris, GMT tomorrow, GMT
> yesterday).
> > -     +1: elukey, jchampion
> > +     headers (GMT now, GMT now Europe/Paris, GMT tomorrow, GMT
> yesterday, PST now).
> > +     +1: elukey
>
> As read it, this proposal (http://apaste.info/FCs) does not enforce
> GMT date and will treat any other timezone as if it were GMT (per
> apr_date_parse_http(), which does not look for "GMT" anywhere), right?
>

Yes, it leaves the current behavior unaltered (except for APR_DATE_BAD).


>
> Shouldn't we either ignore (i.e. not forward) non-GMT, or change it to
> GMT if it's a valid timezone?
>
> The only way to do this would be to use apr_date_parse_rfc() in the first
> place.
> If we'd want to ignore non-GMT, compare it to apr_date_parse_http().
> And only otherwise or if it matches, let it go through ap_update_mtime().
>

My decision to not use apr_date_parse_rfc came after the long discussion in
dev@ about whether or not a Last-Modified header coming from a CGI backend
should be considered HTTP input (so assuming GMT IIUC) or not. Since we
didn't reach a complete agreement on how to proceed I thought to propose a
light change as starter (to answer a question raised in users@ a while back
about warning admins on httpd modifications of the Last-Modified header
value) and think about more invasive changes in the future, like dropping a
non GMT header or converting it in GMT if on a valid timezone.


>
> I don't the difference between this proposal and the current code (but
> TRACEs), ap_update_mtime() is a noop with APR_DATE_BAD.
> Did I miss something?
>

This is a very good point, I didn't notice an important part. From what I
can see ap_update_mtime is indeed not updating anything, but r->mtime is
already 0 and the ap_set_last_modified picks it up and use it (ending up
with a Unix epoch datetime string). My check on APR_DATE_BAD avoids the
call to ap_set_last_modified, this is why the header is dropped. Maybe we
also need something like the following?

 AP_DECLARE(void) ap_set_last_modified(request_rec *r)
 {
-    if (!r->assbackwards) {
+    if (!r->assbackwards && r->mtime > 0) {
         apr_time_t mod_time = ap_rationalize_mtime(r, r->mtime);
         char *datestr = apr_palloc(r->pool, APR_RFC822_DATE_LEN);


Thanks for following up!

Regards,

Luca