Posted to dev@nutch.apache.org by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2014/01/28 09:55:37 UTC

[jira] [Created] (NUTCH-1718) update description of property http.robots.agent

Sebastian Nagel created NUTCH-1718:
--------------------------------------

             Summary: update description of property http.robots.agent
                 Key: NUTCH-1718
                 URL: https://issues.apache.org/jira/browse/NUTCH-1718
             Project: Nutch
          Issue Type: Bug
          Components: fetcher
    Affects Versions: 2.2.1, 2.2, 1.7
            Reporter: Sebastian Nagel
            Priority: Trivial
             Fix For: 2.3, 1.8


The description of property http.robots.agents in nutch-default.xml recommends adding a '*' to the list of agent names. This causes the same problem as described in NUTCH-1715, so the description should be updated. The remark about "order of precedence" also needs updating: since NUTCH-1031, precedence is dictated only by the ordering of user agents in robots.txt. The current description reads:
{code:xml}
<property>
  <name>http.robots.agents</name>
  <value>*</value>
  <description>The agent strings we'll look for in robots.txt files,
  comma-separated, in decreasing order of precedence. You should
  put the value of http.agent.name as the first agent name, and keep the
  default * at the end of the list. E.g.: BlurflDev,Blurfl,*
  </description>
</property>
{code}
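
For illustration, a reworded description along the lines requested here might look like the following. This is only a sketch: the wording and the example value (which reuses the agent names from the existing description, without the trailing '*') are assumptions, not the text of any committed patch.
{code:xml}
<property>
  <name>http.robots.agents</name>
  <!-- illustrative value only: example names taken from the old description, no '*' -->
  <value>BlurflDev,Blurfl</value>
  <description>The agent strings we'll look for in robots.txt files,
  comma-separated. Do not add * to this list (see NUTCH-1715).
  The order of the names in this list does not determine precedence:
  since NUTCH-1031 precedence is dictated only by the ordering of
  user agents in robots.txt.
  </description>
</property>
{code}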


