Posted to dev@nutch.apache.org by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2014/01/28 09:55:37 UTC
[jira] [Created] (NUTCH-1718) update description of property http.robots.agent
Sebastian Nagel created NUTCH-1718:
--------------------------------------
Summary: update description of property http.robots.agent
Key: NUTCH-1718
URL: https://issues.apache.org/jira/browse/NUTCH-1718
Project: Nutch
Issue Type: Bug
Components: fetcher
Affects Versions: 2.2.1, 2.2, 1.7
Reporter: Sebastian Nagel
Priority: Trivial
Fix For: 2.3, 1.8
The description of the property http.robots.agents in nutch-default.xml recommends adding a '*' to the list of agent names. This causes the same problem as described in NUTCH-1715, so the description should be updated. The statement about "order of precedence" also needs updating: since NUTCH-1031, precedence is determined only by the order of user agents in robots.txt.
{code:xml}
<property>
  <name>http.robots.agents</name>
  <value>*</value>
  <description>The agent strings we'll look for in robots.txt files,
  comma-separated, in decreasing order of precedence. You should
  put the value of http.agent.name as the first agent name, and keep the
  default * at the end of the list. E.g.: BlurflDev,Blurfl,*
  </description>
</property>
{code}
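One possible shape for the updated entry, as a sketch only: the description wording and the empty default value are assumptions for illustration, not committed text, and the agent names are the placeholders from the current example. It drops the '*' recommendation and the precedence claim.
{code:xml}
<!-- Hypothetical revision for illustration; wording and default value are assumptions. -->
<property>
  <name>http.robots.agents</name>
  <value></value>
  <description>Comma-separated agent strings to look for in robots.txt
  files, e.g.: BlurflDev,Blurfl
  Do not add a wildcard (*) to this list. The order of the names here
  does not set precedence: which rules apply is determined by the order
  of user agents in robots.txt.
  </description>
</property>
{code}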