Posted to commits@nutch.apache.org by si...@apache.org on 2006/10/24 17:21:43 UTC
svn commit: r467355 - in /lucene/nutch/trunk: CHANGES.txt
src/java/org/apache/nutch/parse/ParseUtil.java
Author: siren
Date: Tue Oct 24 08:21:43 2006
New Revision: 467355
URL: http://svn.apache.org/viewvc?view=rev&rev=467355
Log:
fix for NUTCH-379
Modified:
lucene/nutch/trunk/CHANGES.txt
lucene/nutch/trunk/src/java/org/apache/nutch/parse/ParseUtil.java
Modified: lucene/nutch/trunk/CHANGES.txt
URL: http://svn.apache.org/viewvc/lucene/nutch/trunk/CHANGES.txt?view=diff&rev=467355&r1=467354&r2=467355
==============================================================================
--- lucene/nutch/trunk/CHANGES.txt (original)
+++ lucene/nutch/trunk/CHANGES.txt Tue Oct 24 08:21:43 2006
@@ -50,9 +50,6 @@
17. NUTCH-383 - upgrade to Hadoop 0.7.1 and Lucene 2.0.0. (ab)
-18. NUTCH-391 - ParseUtil logs file contents to log file when it cannot
- find parser (siren)
-
****************************** WARNING !!! ********************************
* This upgrade breaks data format compatibility. A tool 'convertdb' *
* was added to migrate existing CrawlDb-s to the new format. Segment data *
@@ -63,6 +60,11 @@
18. NUTCH-371 - DeleteDuplicates now correctly implements both parts of
the algorithm. (ab)
+19. NUTCH-391 - ParseUtil logs file contents to log file when it cannot
+ find parser (siren)
+
+20. NUTCH-391 - ParseUtil does not pass through the content's URL to the
+ ParserFactory (Chris A. Mattmann via siren)
Release 0.8 - 2006-07-25
Modified: lucene/nutch/trunk/src/java/org/apache/nutch/parse/ParseUtil.java
URL: http://svn.apache.org/viewvc/lucene/nutch/trunk/src/java/org/apache/nutch/parse/ParseUtil.java?view=diff&rev=467355&r1=467354&r2=467355
==============================================================================
--- lucene/nutch/trunk/src/java/org/apache/nutch/parse/ParseUtil.java (original)
+++ lucene/nutch/trunk/src/java/org/apache/nutch/parse/ParseUtil.java Tue Oct 24 08:21:43 2006
@@ -65,7 +65,8 @@
Parser[] parsers = null;
try {
- parsers = this.parserFactory.getParsers(content.getContentType(), "");
+ parsers = this.parserFactory.getParsers(content.getContentType(),
+ content.getUrl() != null ? content.getUrl():"");
} catch (ParserNotFound e) {
if (LOG.isWarnEnabled()) {
LOG.warn("No suitable parser found when trying to parse content " + content.getUrl() +
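
Note on the ParseUtil.java hunk above: the second argument to getParsers changes from a hard-coded "" to the content's URL, falling back to "" when the URL is null. The following is a minimal, self-contained sketch of that null-safe fallback only. The class UrlFallbackSketch, its nested Content class, and the helper urlOrEmpty are hypothetical stand-ins written for illustration; they are not Nutch classes, and the sketch does not call the real ParserFactory.

```java
// Hypothetical illustration of the null-safe URL fallback from the hunk above.
// None of these types are Nutch classes; they only mirror the getters used in the diff.
public class UrlFallbackSketch {

    /** Stand-in for the content object, exposing only getUrl() and getContentType(). */
    public static final class Content {
        private final String url;
        private final String contentType;

        public Content(String url, String contentType) {
            this.url = url;
            this.contentType = contentType;
        }

        public String getUrl() { return url; }

        public String getContentType() { return contentType; }
    }

    /** Same conditional expression as the added diff lines: the URL if present, else "". */
    public static String urlOrEmpty(Content content) {
        return content.getUrl() != null ? content.getUrl() : "";
    }

    public static void main(String[] args) {
        Content withUrl = new Content("http://example.com/doc.pdf", "application/pdf");
        Content withoutUrl = new Content(null, "text/html");

        // Brackets make the empty-string case visible in the output.
        System.out.println("[" + urlOrEmpty(withUrl) + "]");
        System.out.println("[" + urlOrEmpty(withoutUrl) + "]");
    }
}
```

With the fallback in place, a caller always hands over a non-null string: the real URL when one is set, and the same "" the old code passed otherwise.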