Posted to commits@nutch.apache.org by ma...@apache.org on 2015/04/19 01:49:52 UTC

svn commit: r1674588 - in /nutch/trunk: CHANGES.txt conf/nutch-default.xml

Author: mattmann
Date: Sat Apr 18 23:49:52 2015
New Revision: 1674588

URL: http://svn.apache.org/r1674588
Log:
tickle to close out pull request committed to 2.x by Meabed. This closes #8.

Modified:
    nutch/trunk/CHANGES.txt
    nutch/trunk/conf/nutch-default.xml

Modified: nutch/trunk/CHANGES.txt
URL: http://svn.apache.org/viewvc/nutch/trunk/CHANGES.txt?rev=1674588&r1=1674587&r2=1674588&view=diff
==============================================================================
--- nutch/trunk/CHANGES.txt (original)
+++ nutch/trunk/CHANGES.txt Sat Apr 18 23:49:52 2015
@@ -1,7 +1,7 @@
 Nutch Change Log
  
 Nutch Current Development 1.10-SNAPSHOT
-
+ 
 * NUTCH-1854 bin/crawl fails with a parsing fetcher (Asitang Mishra via snagel)
 
 * NUTCH-1989 Handling invalid URLs in CommonCrawlDataDumper (Giuseppe Totaro via mattmann)

Modified: nutch/trunk/conf/nutch-default.xml
URL: http://svn.apache.org/viewvc/nutch/trunk/conf/nutch-default.xml?rev=1674588&r1=1674587&r2=1674588&view=diff
==============================================================================
--- nutch/trunk/conf/nutch-default.xml (original)
+++ nutch/trunk/conf/nutch-default.xml Sat Apr 18 23:49:52 2015
@@ -119,7 +119,7 @@
 
 <property>
   <name>http.robot.rules.whitelist</name>
-  <value></value>
+  <value>baron.pagemewhen.com</value>
   <description>Comma separated list of hostnames or IP addresses to ignore 
   robot rules parsing for. Use with care and only if you are explicitly
   allowed by the site owner to ignore the site's robots.txt!