Posted to users@httpd.apache.org by "@lbutlr" <kr...@kreme.com> on 2018/10/03 17:39:31 UTC

[users@httpd] 0 length robot.txt

This is probably a coincidence, but I had one of my hosted sites (with no php code anywhere, and certainly no .php files) returning a script error on load instead of showing the non-php webpage:

[proxy_fcgi:error] [pid 88148] [client xx.xx.xx.xx:63137] AH01071: Got error 'Primary script unknown\n'

And it would display a blank page for a few seconds, then “File Not Found” would appear. There was no HTTP error code.

Other hosted sites, also not using php, didn’t have this problem, and sites that did use php were working fine.

All sites are configured to allow php, and have an fcgi in their configuration:

   DocumentRoot ${WEBROOT}
   ProxyPassMatch ^/(.*\.php)$ fcgi://127.0.0.1:9000/${WEBROOT}$1

On the problem site, if I commented out the ProxyPassMatch line and reloaded apache, the site would load.

Confusing.

Not being sure what was causing this on one specific site, I started comparing the directory structure, .htaccess, and anything else I could look at to see what was different about this particular site. I noticed that it had a zero-length robots.txt file in the webroot, which no other site had.

Removing that file made the site load properly.

I still get many script errors in the error log, but these mostly have a referer [sic] at the end and are obviously attempts to hack into the page:

[proxy_fcgi:error] [pid 42901] [client 178.137.92.187:61783] AH01071: Got error 'Primary script unknown\n', referer: /www/XXX/license.php

[proxy_fcgi:error] [pid 18168] [client 74.71.9.14:59624] AH01071: Got error 'Primary script unknown\n'
[core:info] [pid 43056] [client 74.71.9.14:59637] AH00128: File does not exist: /www/XXX/apple-touch-icon-120x120-precomposed.png

But I still get a few bare ones:

[Wed Oct 03 08:13:05.504129 2018] [proxy_fcgi:error] [pid 43364] [client 74.71.9.14:57753] AH01071: Got error 'Primary script unknown\n'
[Wed Oct 03 09:17:36.194394 2018] [proxy_fcgi:error] [pid 42840] [client 54.36.148.74:60192] AH01071: Got error 'Primary script unknown\n'
[Wed Oct 03 10:08:08.834583 2018] [proxy_fcgi:error] [pid 18168] [client 74.71.9.14:59624] AH01071: Got error 'Primary script unknown\n'
[Wed Oct 03 10:17:17.791282 2018] [proxy_fcgi:error] [pid 43056] [client 180.76.15.30:29494] AH01071: Got error 'Primary script unknown\n'
[Wed Oct 03 10:40:17.322634 2018] [proxy_fcgi:error] [pid 42840] [client 74.71.9.14:64211] AH01071: Got error 'Primary script unknown\n'
[Wed Oct 03 11:27:58.098639 2018] [proxy_fcgi:error] [pid 18168] [client 202.46.50.182:22728] AH01071: Got error 'Primary script unknown\n'
[Wed Oct 03 11:33:13.054967 2018] [proxy_fcgi:error] [pid 43056] [client 123.125.71.77:21018] AH01071: Got error 'Primary script unknown\n'

(The 74.71.9.14 IP has made more than 50,000 requests for non-existent files on this hosted domain. Sadly, it is a residential IP from RoadRunner, and they are worthless to deal with, regardless of how often they change their company name.)

Why would a blank robots.txt cause this issue? Or is there something else going on here and this is just a weird coincidence?
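
For reference, the pattern in the ProxyPassMatch line above, ^/(.*\.php)$, only matches request paths ending in .php, so a request for /robots.txt is never handed to the FastCGI backend by that directive. A commonly documented alternative on httpd 2.4.10 and later is to route PHP through a handler, and only when the file exists on disk, which would be expected to keep requests for missing scripts from reaching the backend at all. The following is a minimal sketch, not something tested against this server; it assumes the same PHP-FPM listener on 127.0.0.1:9000 and would replace the ProxyPassMatch line rather than sit alongside it.

```apache
# Sketch only: assumes httpd 2.4.10+ with mod_proxy and mod_proxy_fcgi loaded,
# and PHP-FPM listening on 127.0.0.1:9000 as in the configuration above.
<FilesMatch "\.php$">
    # Hand the request to the backend only if the script exists on disk;
    # otherwise httpd answers with its own 404.
    <If "-f %{REQUEST_FILENAME}">
        SetHandler "proxy:fcgi://127.0.0.1:9000"
    </If>
</FilesMatch>
```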

-- 
There's a race of men that don't fit in, A race that can't stay still So
they break the hearts of kith and kin, And they roam the world at will.



---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@httpd.apache.org
For additional commands, e-mail: users-help@httpd.apache.org


[users@httpd] Re: 0 length robot.txt

Posted by "@lbutlr" <kr...@kreme.com>.
On 04 Oct 2018, at 13:20, Filipe Cifali <ci...@gmail.com> wrote:
> As for the docs, this project is open source; we can change (or rather, propose changes to) the documentation anytime we want.

Sure, but first you have to figure out the multiple layers of complexity in the current docs.


-- 
Boy, it sure would be nice if we had some grenades, don'tcha think?




Re: [users@httpd] Re: 0 length robot.txt

Posted by Filipe Cifali <ci...@gmail.com>.
It's a bit strange to say that considering there is a page covering the
changes from 2.2 to 2.4:

https://httpd.apache.org/docs/2.4/upgrading.html

As for the docs, this project is open source; we can change (or rather,
propose changes to) the documentation anytime we want.

On Thu, Oct 4, 2018 at 3:54 PM @lbutlr <kr...@kreme.com> wrote:

> On 04 Oct 2018, at 11:50, Filipe Cifali <ci...@gmail.com> wrote:
> > You want to use a CustomLog for virtualhost config to gather the most
> > info you can from the request:
> >
> > https://httpd.apache.org/docs/current/mod/mod_log_config.html#customlog
>
> Ugh. That is a terrible bit of documentation written by and for people who
> don’t need documentation.
>
> It would be nice if there was something that clearly explained all of
> this, especially considering how it’s changed since 2.2.
>
> I’ve enabled the proxy and set CustomLog /path/log debug
>
> Everything has been working for a bit now; this is annoying. :/
>
> --
> FRIDAYS ARE NOT "PANTS OPTIONAL" Bart chalkboard Ep. AABF23
>
>
>

-- 
[ ]'s

Filipe Cifali Stangler

[users@httpd] Re: 0 length robot.txt

Posted by "@lbutlr" <kr...@kreme.com>.
On 04 Oct 2018, at 11:50, Filipe Cifali <ci...@gmail.com> wrote:
> You want to use a CustomLog for virtualhost config to gather the most info you can from the request:
> 
> https://httpd.apache.org/docs/current/mod/mod_log_config.html#customlog

Ugh. That is a terrible bit of documentation written by and for people who don’t need documentation.

It would be nice if there was something that clearly explained all of this, especially considering how it’s changed since 2.2.

I’ve enabled the proxy and set CustomLog /path/log debug

Everything has been working for a bit now; this is annoying. :/

-- 
FRIDAYS ARE NOT "PANTS OPTIONAL" Bart chalkboard Ep. AABF23





Re: [users@httpd] Re: 0 length robot.txt

Posted by Filipe Cifali <ci...@gmail.com>.
You want to use a CustomLog for virtualhost config to gather the most info
you can from the request:

https://httpd.apache.org/docs/current/mod/mod_log_config.html#customlog

Also, read the *Context* so you know where you can use them:

https://httpd.apache.org/docs/2.4/mod/core.html#LogLevel
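
As a rough illustration of the Context point: in 2.4, LogLevel is accepted inside a <VirtualHost> and can raise verbosity for individual modules. A value such as rewrite:trace8 only affects mod_rewrite messages, so for the 'Primary script unknown' error the proxy modules are the more likely ones to raise. A minimal sketch, assuming a hypothetical host name and log path:

```apache
<VirtualHost *:443>
    # Hypothetical names; substitute the real vhost's values.
    ServerName www.example.com
    ErrorLog /path/to/example.error_log

    # ... existing directives for the vhost go here ...

    # Keep everything else at warn, but log mod_proxy and mod_proxy_fcgi
    # at debug. The extra messages go to this vhost's ErrorLog.
    LogLevel warn proxy:debug proxy_fcgi:debug
</VirtualHost>
```

The added messages land in the error log, not in the access log written by CustomLog.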



On Thu, Oct 4, 2018 at 2:46 PM @lbutlr <kr...@kreme.com> wrote:

> On 03 Oct 2018, at 18:27, Filipe Cifali <ci...@gmail.com> wrote:
> > You can, for example, turn the log level to debug and access the site; tailing
> > the logs should provide some information about what is breaking.
>
> Is it possible to set the log level just for a virtual host? I thought
> that was a server-wide setting. I tried adding
>
> LogLevel warn rewrite:trace8
>
> to the virtual host and didn’t get an error on starting apache, but the
> http-error log for the site didn’t appear any different.
>
> > Also, why do you have a ProxyPass on a virtualhost that doesn't run
> > anything PHP? Create a template without the config and use it.
>
> All the sites are set up for php so that I don’t have to get an email, go
> edit a file, and restart apache just because someone wants to put some php
> code in their page.
>
> At least today it is failing immediately, so debugging should be easier.
>
> --
> @mdhughes: One of the few regrets I have about lawn-less apartments:
> Shallow graves are so much harder to come by.
>
>
>

-- 
[ ]'s

Filipe Cifali Stangler

[users@httpd] Re: 0 length robot.txt

Posted by "@lbutlr" <kr...@kreme.com>.
On 03 Oct 2018, at 18:27, Filipe Cifali <ci...@gmail.com> wrote:
> You can, for example, turn the log level to debug and access the site; tailing the logs should provide some information about what is breaking.

Is it possible to set the log level just for a virtual host? I thought that was a server-wide setting. I tried adding 

LogLevel warn rewrite:trace8

to the virtual host and didn’t get an error on starting apache, but the http-error log for the site didn’t appear any different.

> Also, why do you have a ProxyPass on a virtualhost that doesn't run anything PHP? Create a template without the config and use it.

All the sites are set up for php so that I don’t have to get an email, go edit a file, and restart apache just because someone wants to put some php code in their page.

At least today it is failing immediately, so debugging should be easier.

-- 
@mdhughes: One of the few regrets I have about lawn-less apartments:
Shallow graves are so much harder to come by.





Re: [users@httpd] Re: 0 length robot.txt

Posted by Filipe Cifali <ci...@gmail.com>.
Lewis,

You can, for example, turn the log level to debug and access the site; tailing
the logs should provide some information about what is breaking. Also, why
do you have a ProxyPass on a virtualhost that doesn't run anything PHP? Create
a template without the config and use it.

On Wed, Oct 3, 2018 at 8:11 PM @lbutlr <kr...@kreme.com> wrote:

> On 03 Oct 2018, at 12:27, @lbutlr <kr...@kreme.com> wrote:
> > There is exactly one line in the site configuration that, when
> > commented, makes the site work again. Though, possibly only for a little
> > while. I’ll have to check more in 3-4 hours. There is no other proxy logic
> > at all.
>
> It’s been over 4 hours now (almost 5) and the site is still responding
> perfectly. I still have no idea what is causing it to break if I uncomment
> the ProxyPass line considering there is no php anywhere on the site other
> than a couple of href to external sites.
>
> --
> "What's a Velvet Underground?" "You wouldn't like it." "Oh, Be-bop."
>
>

-- 
[ ]'s

Filipe Cifali Stangler

[users@httpd] Re: 0 length robot.txt

Posted by "@lbutlr" <kr...@kreme.com>.
On 06 Oct 2018, at 17:59, Filipe Cifali <ci...@gmail.com> wrote:
> It's described on the CustomLog docs: https://httpd.apache.org/docs/current/mod/mod_log_config.html#customlog
> 
> "The second argument specifies what will be written to the log file. It can specify either a ***nickname*** defined by a previous LogFormat directive, or it can be an explicit ***format*** string as described in the log formats section."

Yes, I know this. The oddity is simply that changing it to, essentially, a nonsense setting has prevented the site from crashing exactly like disabling the Proxy prevented the site from crashing.

-- 
Just give us a kiss to celebrate here, today.




Re: [users@httpd] Re: 0 length robot.txt

Posted by Filipe Cifali <ci...@gmail.com>.
It's described on the CustomLog docs:
https://httpd.apache.org/docs/current/mod/mod_log_config.html#customlog

"The second argument specifies what will be written to the log file. It can
specify either a ***nickname*** defined by a previous LogFormat
<https://httpd.apache.org/docs/current/mod/mod_log_config.html#logformat>
directive, or it can be an explicit ***format*** string as described in the log
formats
<https://httpd.apache.org/docs/current/mod/mod_log_config.html#formats>
section. "

Either use it this way:
  LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
  CustomLog /home/user/logs/XXX.access_log combined

Or this way:
  CustomLog "/home/user/logs/XXX.access_log" "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\""

You see, "combined" is just a nickname for the LogFormat. You can add
something like "my-site-special-log-format", and as long as you reference it
in the CustomLog it will work, because it's just an alias.
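
Along the same lines, a nickname can carry extra fields that help when chasing a problem like this one. The sketch below is only an illustration: the nickname vhost_diag and the log path are made up, while %v (server name), %f (filename), %R (handler generating the response) and %D (time taken in microseconds) are standard mod_log_config format strings in 2.4. Note also that a second argument which does not match any defined nickname is taken as a literal format string, which would explain a log that contains only the word "debug".

```apache
# Hypothetical nickname and path, for illustration only.
# file= shows the filesystem path httpd mapped the request to;
# handler= shows which handler produced the response, if any.
LogFormat "%v %h %t \"%r\" %>s %b file=\"%f\" handler=\"%R\" time_us=%D" vhost_diag
CustomLog "/home/user/logs/XXX.diag_log" vhost_diag
```

A vhost can have more than one CustomLog, so a line like this could sit next to the existing "combined" log without replacing it.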

On Sat, Oct 6, 2018 at 8:51 PM @lbutlr <kr...@kreme.com> wrote:

> On 03 Oct 2018, at 17:11, @lbutlr <kr...@kreme.com> wrote:
> > It’s been over 4 hours now (almost 5) and the site is still responding
> > perfectly.
>
> Well, I am more confused. I changed the log from combined to debug and the
> site has been fine for days now.
>
> -  CustomLog /home/user/logs/XXX.access_log combined
> +  CustomLog /home/user/logs/XXX.access_log debug
>
> This was a mistake, as it simply logs “debug” now, so the logs are
> useless, but the site is up.
>
> In httpd.conf:
>  LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
>
> ¯\_(ツ)_/¯
>
> --
> ALL WORK AND NO PLAY MAKES BART A DULL BOY ALL WORK AND NO PLAY MAKES
> BART A DULL BOY ALL WORK AND NO PLAY MAKES BART A DULL BOY Bart
> chalkboard Ep. 1F07
>
>
>

-- 
[ ]'s

Filipe Cifali Stangler

[users@httpd] Re: 0 length robot.txt

Posted by "@lbutlr" <kr...@kreme.com>.
On 03 Oct 2018, at 17:11, @lbutlr <kr...@kreme.com> wrote:
> It’s been over 4 hours now (almost 5) and the site is still responding perfectly.

Well, I am more confused. I changed the log from combined to debug and the site has been fine for days now.

-  CustomLog /home/user/logs/XXX.access_log combined
+  CustomLog /home/user/logs/XXX.access_log debug

This was a mistake, as it simply logs “debug” now, so the logs are useless, but the site is up. 

In httpd.conf:
 LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined

¯\_(ツ)_/¯ 

-- 
ALL WORK AND NO PLAY MAKES BART A DULL BOY ALL WORK AND NO PLAY MAKES
BART A DULL BOY ALL WORK AND NO PLAY MAKES BART A DULL BOY Bart
chalkboard Ep. 1F07





[users@httpd] Re: 0 length robot.txt

Posted by "@lbutlr" <kr...@kreme.com>.
On 03 Oct 2018, at 12:27, @lbutlr <kr...@kreme.com> wrote:
> There is exactly one line in the site configuration that, when commented, makes the site work again. Though, possibly only for a little while. I’ll have to check more in 3-4 hours. There is no other proxy logic at all.

It’s been over 4 hours now (almost 5) and the site is still responding perfectly. I still have no idea what is causing it to break if I uncomment the ProxyPass line, considering there is no php anywhere on the site other than a couple of hrefs to external sites.

-- 
"What's a Velvet Underground?" "You wouldn't like it." "Oh, Be-bop."




[users@httpd] Re: 0 length robot.txt

Posted by "@lbutlr" <kr...@kreme.com>.
On 03 Oct 2018, at 12:07, Filipe Cifali <ci...@gmail.com> wrote:
> you can check what virtualhost is being served via apache2ctl like this: $ apache2ctl -S
> $ apache2ctl -h provides this info:
>   -S                 : a synonym for -t -D DUMP_VHOSTS -D DUMP_RUN_CFG

Yes that is all fine, and the site was loading perfectly for almost three and a half hours.

         port 443 namevhost www.XXX.com (/usr/local/etc/apache24/users/XXX.conf:1)
                 alias XXX.com
         port 80 namevhost www.XXX.com (/usr/local/etc/apache24/users/XXX.conf:26)
                 alias XXX.com

I do not have an apache2ctl, just apachectl (Apache 2.4 on FreeBSD 11.2-RELEASE, compiled from ports).

> After checking that the right vhost is being served, start removing proxy logic and just make the txt work again, then slowly start adding the proxy config to make the php work again. 

There is exactly one line in the site configuration that, when commented, makes the site work again. Though, possibly only for a little while. I’ll have to check more in 3-4 hours. There is no other proxy logic at all.

> If you can, post the full vhost here regarding the domain that misbehaves. 

Sure, but other than the host name, it is identical to all the other sites.

<VirtualHost *:443>
   ServerName www.XXX
   ServerAlias XXX
   DocumentRoot /www/XXX/
   #ProxyPassMatch ^/(.*\.php)$ fcgi://127.0.0.1:9000/www/XXX/$1
   <Directory "/www/XXX/">
     Options +Indexes +FollowSymLinks +MultiViews -SymLinksIfOwnerMatch
     AllowOverride all
     Require all granted
   </Directory>
   SSLEngine on
    SSLCertificateFile /usr/local/etc/dehydrated/certs/XXX/cert.pem
    SSLCertificateKeyFile /usr/local/etc/dehydrated/certs/XXX/privkey.pem
    SSLCertificateChainFile /usr/local/etc/dehydrated/certs/XXX/chain.pem
   SSLProtocol ALL -SSLv2 -SSLv3
   SSLHonorCipherOrder on
   SSLCipherSuite ECDH+AESGCM:DH+AESGCM:ECDH+AES256:DH+AES256:ECDH+AES128:DH+AES:ECDH+3DES:DH+3DES:RSA+AESGCM:RSA+AES:RSA+3DES:!aNULL:!MD5:!DSS
   # 15638400 seconds is 181 days
   # 63072000 seconds is 730 days
   Header always set Strict-Transport-Security "max-age=15638400; includeSubdomains;"
   Header always set X-Frame-Options DENY
   ErrorLog /home/user1/logs/XXX.error_log
   CustomLog /home/user1/logs/XXX.access_log combined
</VirtualHost>


> The important part is: Having a zeroed robots.txt doesn't break httpd.

Yeah, it didn’t seem likely, but then again it seemed to work for a bit…

And, just for kicks:
# apachectl -M
Loaded Modules:
 core_module (static)
 so_module (static)
 http_module (static)
 authn_file_module (shared)
 mpm_prefork_module (shared)
 authn_dbm_module (shared)
 authn_core_module (shared)
 authz_host_module (shared)
 authz_groupfile_module (shared)
 authz_user_module (shared)
 authz_dbm_module (shared)
 authz_core_module (shared)
 access_compat_module (shared)
 auth_basic_module (shared)
 auth_digest_module (shared)
 socache_shmcb_module (shared)
 socache_dbm_module (shared)
 reqtimeout_module (shared)
 include_module (shared)
 filter_module (shared)
 mime_module (shared)
 log_config_module (shared)
 env_module (shared)
 headers_module (shared)
 setenvif_module (shared)
 version_module (shared)
 proxy_module (shared)
 proxy_fcgi_module (shared)
 ssl_module (shared)
 unixd_module (shared)
 dav_module (shared)
 status_module (shared)
 autoindex_module (shared)
 cgi_module (shared)
 dav_fs_module (shared)
 vhost_alias_module (shared)
 dir_module (shared)
 userdir_module (shared)
 alias_module (shared)
 rewrite_module (shared)

# cat /www/XXX/.htaccess
Options +Includes +FollowSymLinks +MultiViews

-- 
One tequila, two tequila, three tequila, floor.





Re: [users@httpd] Re: 0 length robot.txt

Posted by Filipe Cifali <ci...@gmail.com>.
Hi Kremels,

you can check what virtualhost is being served via apache2ctl like this:
$ apache2ctl -S
$ apache2ctl -h provides this info:
  -S                 : a synonym for -t -D DUMP_VHOSTS -D DUMP_RUN_CFG

After checking that the right vhost is being served, start removing proxy
logic and just make the txt work again, then slowly start adding the proxy
config to make the php work again.

If you can, post the full vhost here regarding the domain that misbehaves.

The important part is: Having a zeroed robots.txt doesn't break httpd.

On Wed, Oct 3, 2018 at 2:59 PM @lbutlr <kr...@kreme.com> wrote:

> On 03 Oct 2018, at 11:39, @lbutlr <kr...@kreme.com> wrote:
> > Removing that file made the site load properly.
>
> Well, it did for about 3 hours 25 minutes, in fact.
>
> Just after posting the message, the site went back to showing only “File
> Not Found”.
>
> I’m at a loss.
>
> The only other issue I see is in the main http-error log, where there are
> repeated instances of:
>
> [ssl:info] [pid 43234] (70014)End of file found: [client 106.45.1.92:48564]
> AH01991: SSL input filter read failed.
>
> (From various client addresses)
>
> The site in question gets a grade of A+ from SSL Labs, and this error
> message appears to be somewhat spurious in nature as apache tries to use
> the default cert for the site before getting the server name, then loads
> the correct cert, so I don’t think this is really an issue.
>
> --
> Han : This is not going to work.
> Luke: Why didn't you say so before?
> Han : I did say so before!
>
>

-- 
[ ]'s

Filipe Cifali Stangler

[users@httpd] Re: 0 length robot.txt

Posted by "@lbutlr" <kr...@kreme.com>.
On 03 Oct 2018, at 11:39, @lbutlr <kr...@kreme.com> wrote:
> Removing that file made the site load properly.

Well, it did for about 3 hours 25 minutes, in fact.

Just after posting the message, the site went back to showing only “File Not Found”.

I’m at a loss.

The only other issue I see is in the main http-error log, where there are repeated instances of:

[ssl:info] [pid 43234] (70014)End of file found: [client 106.45.1.92:48564] AH01991: SSL input filter read failed.

(From various client addresses)

The site in question gets a grade of A+ from SSL Labs, and this error message appears to be somewhat spurious in nature as apache tries to use the default cert for the site before getting the server name, then loads the correct cert, so I don’t think this is really an issue.

-- 
Han : This is not going to work.
Luke: Why didn't you say so before?
Han : I did say so before!

