- Jenkins build is back to normal : Nutch-nutchgora #270 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2012/06/01 06:26:27 UTC, 0 replies.
- Jenkins build is back to normal : Nutch-trunk #1858 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2012/06/01 06:40:28 UTC, 0 replies.
- Re: [VOTE] Apache Nutch 1.5 release-1.5RC4 - posted by Julien Nioche <li...@gmail.com> on 2012/06/01 11:01:17 UTC, 4 replies.
- Questions about the "hostCount" and related variables in org.apache.nutch.crawl.Generator$Selector::reduce() - posted by Ali Safdar Kureishy <sa...@gmail.com> on 2012/06/04 13:52:44 UTC, 0 replies.
- [jira] [Created] (NUTCH-1380) Fetcher reducer not to configure filter/normalizers - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2012/06/04 15:10:22 UTC, 0 replies.
- [jira] [Created] (NUTCH-1381) Allow to override default subcollection field name - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2012/06/04 15:10:23 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1381) Allow to override default subcollection field name - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2012/06/04 15:12:23 UTC, 2 replies.
- [RESULT] [VOTE] Apache Nutch 1.5 release-1.5RC4 - posted by Lewis John Mcgibbney <le...@gmail.com> on 2012/06/07 13:56:14 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1370) Expose exact number of urls injected @runtime - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2012/06/07 15:19:23 UTC, 0 replies.
- [ANNOUNCE] Apache Nutch 1.5 Released - posted by lewis john mcgibbney <le...@apache.org> on 2012/06/07 18:52:32 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1381) Allow to override default subcollection field name - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2012/06/07 20:20:23 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1351) DomainStatistics to aggregate by TLD - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2012/06/07 20:22:23 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1320) IndexChecker and ParseChecker choke on IDN's - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2012/06/07 20:50:22 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1320) IndexChecker and ParseChecker choke on IDN's - posted by "Hudson (JIRA)" <ji...@apache.org> on 2012/06/07 21:06:23 UTC, 1 replies.
- [jira] [Commented] (NUTCH-1351) DomainStatistics to aggregate by TLD - posted by "Hudson (JIRA)" <ji...@apache.org> on 2012/06/07 21:06:23 UTC, 1 replies.
- [jira] [Commented] (NUTCH-1342) Read time out protocol-http - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2012/06/07 21:06:23 UTC, 2 replies.
- [jira] [Assigned] (NUTCH-1342) Read time out protocol-http - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2012/06/07 21:06:23 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1381) Allow to override default subcollection field name - posted by "Hudson (JIRA)" <ji...@apache.org> on 2012/06/07 21:06:23 UTC, 1 replies.
- [jira] [Commented] (NUTCH-1341) NotModified time set to now but page not modified - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2012/06/07 21:12:22 UTC, 6 replies.
- [jira] [Resolved] (NUTCH-1346) Follow outlinks to ignore external - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2012/06/08 09:05:22 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1346) Follow outlinks to ignore external - posted by "Hudson (JIRA)" <ji...@apache.org> on 2012/06/08 09:26:23 UTC, 1 replies.
- [jira] [Resolved] (NUTCH-1336) Optionally not index db_notmodified pages - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2012/06/08 09:39:22 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1024) Dynamically set fetchInterval by MIME-type - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2012/06/08 09:41:23 UTC, 2 replies.
- [jira] [Commented] (NUTCH-1262) Map `duplicating` content-types to a single type - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2012/06/08 09:43:22 UTC, 2 replies.
- [jira] [Commented] (NUTCH-1336) Optionally not index db_notmodified pages - posted by "Hudson (JIRA)" <ji...@apache.org> on 2012/06/08 10:44:24 UTC, 1 replies.
- [jira] [Closed] (NUTCH-1361) Fix mishandling of malformed urls in generator job - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2012/06/08 16:07:23 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1352) Improve regex urlfilters/normalizers synchronization - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2012/06/08 16:07:23 UTC, 2 replies.
- [Nutch Wiki] Trivial Update of "Release_HOWTO" by LewisJohnMcgibbney - posted by Apache Wiki <wi...@apache.org> on 2012/06/08 16:21:09 UTC, 1 replies.
- VOTE Apache Nutch 2.0 RC1 - posted by lewis john mcgibbney <le...@apache.org> on 2012/06/08 16:49:19 UTC, 29 replies.
- [jira] [Created] (NUTCH-1382) Adding support for EmbeddedSolrServer to SolrIndexer - posted by "Emre Çelikten (JIRA)" <ji...@apache.org> on 2012/06/08 17:38:22 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1382) Adding support for EmbeddedSolrServer to SolrIndexer - posted by "Emre Çelikten (JIRA)" <ji...@apache.org> on 2012/06/08 17:38:23 UTC, 0 replies.
- [jira] [Created] (NUTCH-1383) IndexingFiltersChecker to show error message instead of null pointer exception - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2012/06/09 23:43:42 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1383) IndexingFiltersChecker to show error message instead of null pointer exception - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2012/06/09 23:51:42 UTC, 0 replies.
- [Nutch Wiki] Update of "NutchTutorial" by SebastianNagel - posted by Apache Wiki <wi...@apache.org> on 2012/06/10 21:31:07 UTC, 0 replies.
- bin/nutch -core - posted by Sebastian Nagel <wa...@googlemail.com> on 2012/06/10 23:24:35 UTC, 3 replies.
- [jira] [Created] (NUTCH-1384) Typo in ParseSegment's run-method - posted by "Matthias Agethle (JIRA)" <ji...@apache.org> on 2012/06/11 08:26:42 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1385) More robust plug-in order properties in "nutch-site.xml" - posted by "Andy Xue (JIRA)" <ji...@apache.org> on 2012/06/11 09:19:43 UTC, 4 replies.
- [jira] [Created] (NUTCH-1385) More robust plug-in order properties in "nutch-site.xml" - posted by "Andy Xue (JIRA)" <ji...@apache.org> on 2012/06/11 09:19:43 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1384) Typo in ParseSegment's run-method - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2012/06/11 09:25:43 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1383) IndexingFiltersChecker to show error message instead of null pointer exception - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2012/06/11 09:25:43 UTC, 0 replies.
- [jira] [Assigned] (NUTCH-1385) More robust plug-in order properties in "nutch-site.xml" - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2012/06/11 11:25:42 UTC, 0 replies.
- [jira] [Assigned] (NUTCH-1384) Typo in ParseSegment's run-method - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2012/06/11 11:27:43 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1385) More robust plug-in order properties in "nutch-site.xml" - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2012/06/11 11:29:42 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1384) Typo in ParseSegment's run-method - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2012/06/11 11:31:43 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1262) Map `duplicating` content-types to a single type - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2012/06/11 12:28:43 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1262) Map `duplicating` content-types to a single type - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2012/06/11 12:32:42 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1385) More robust plug-in order properties in "nutch-site.xml" - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2012/06/11 16:15:43 UTC, 2 replies.
- [jira] [Commented] (NUTCH-1384) Typo in ParseSegment's run-method - posted by "Hudson (JIRA)" <ji...@apache.org> on 2012/06/11 22:23:44 UTC, 1 replies.
- [jira] [Resolved] (NUTCH-1360) Suport the storing of IP address connected to when web crawling - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2012/06/11 22:29:43 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1360) Suport the storing of IP address connected to when web crawling - posted by "Hudson (JIRA)" <ji...@apache.org> on 2012/06/11 23:09:43 UTC, 1 replies.
- [jira] [Updated] (NUTCH-1364) Add a counter for malformed urls - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2012/06/12 00:10:43 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1364) Add a counter for malformed urls - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2012/06/12 00:24:43 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1364) Add a counter in Generator for malformed urls - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2012/06/12 02:12:43 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1364) Add a counter in Generator for malformed urls - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2012/06/12 02:14:43 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1364) Add a counter in Generator for malformed urls - posted by "Hudson (JIRA)" <ji...@apache.org> on 2012/06/12 03:14:43 UTC, 1 replies.
- [jira] [Resolved] (NUTCH-1024) Dynamically set fetchInterval by MIME-type - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2012/06/12 12:12:44 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1352) Improve regex urlfilters/normalizers synchronization - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2012/06/12 12:16:43 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1356) ParseUtil use ExecutorService instead of manually thread handling. - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2012/06/12 12:18:43 UTC, 0 replies.
- [jira] [Created] (NUTCH-1386) Headings filter not to add empty values - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2012/06/12 12:22:42 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1386) Headings filter not to add empty values - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2012/06/12 12:22:43 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1319) HostNormalizer - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2012/06/12 12:34:42 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1330) OutlinkDB to preserve back up - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2012/06/12 12:43:42 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1330) OutlinkDB to preserve back up - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2012/06/12 12:43:42 UTC, 2 replies.
- [Nutch Wiki] Update of "bin/nutch solrindex" by MarkusJelsma - posted by Apache Wiki <wi...@apache.org> on 2012/06/12 12:57:52 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1356) ParseUtil use ExecutorService instead of manually thread handling. - posted by "Ferdy Galema (JIRA)" <ji...@apache.org> on 2012/06/12 13:08:43 UTC, 2 replies.
- [jira] [Created] (NUTCH-1387) All parsers should respond to cancellation. - posted by "Ferdy Galema (JIRA)" <ji...@apache.org> on 2012/06/12 13:12:43 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1387) All parsers should respond to cancellation / interrupts. - posted by "Ferdy Galema (JIRA)" <ji...@apache.org> on 2012/06/12 13:14:42 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1300) Indexer to normalize URL's - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2012/06/12 13:28:43 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1318) Parse time outs crash parsing fetcher - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2012/06/12 13:30:43 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1300) Indexer to normalize URL's - posted by "Hudson (JIRA)" <ji...@apache.org> on 2012/06/12 13:46:43 UTC, 1 replies.
- [jira] [Commented] (NUTCH-1386) Headings filter not to add empty values - posted by "Hudson (JIRA)" <ji...@apache.org> on 2012/06/12 13:46:43 UTC, 1 replies.
- [jira] [Commented] (NUTCH-1319) HostNormalizer - posted by "Hudson (JIRA)" <ji...@apache.org> on 2012/06/12 13:46:44 UTC, 2 replies.
- [jira] [Created] (NUTCH-1388) Optionally maintain custom fetch interval despite AdaptiveFetchSchedule - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2012/06/12 15:07:42 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1388) Optionally maintain custom fetch interval despite AdaptiveFetchSchedule - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2012/06/12 15:09:44 UTC, 2 replies.
- [jira] [Created] (NUTCH-1389) parsechecker and indexchecker to report truncated content - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2012/06/12 22:47:42 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1389) parsechecker and indexchecker to report truncated content - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2012/06/12 22:53:43 UTC, 0 replies.
- Nutch and IPv6 - posted by Lewis John Mcgibbney <le...@gmail.com> on 2012/06/13 00:01:05 UTC, 0 replies.
- [Nutch Wiki] Trivial Update of "GORA_HBase" by LewisJohnMcgibbney - posted by Apache Wiki <wi...@apache.org> on 2012/06/13 12:57:43 UTC, 2 replies.
- [Nutch Wiki] Trivial Update of "FrontPage" by LewisJohnMcgibbney - posted by Apache Wiki <wi...@apache.org> on 2012/06/13 13:05:18 UTC, 4 replies.
- [Nutch Wiki] Trivial Update of "Nutch2Tutorial" by LewisJohnMcgibbney - posted by Apache Wiki <wi...@apache.org> on 2012/06/13 13:05:42 UTC, 3 replies.
- Suitable Nutch 2.0 Project Description - posted by Lewis John Mcgibbney <le...@gmail.com> on 2012/06/13 14:29:29 UTC, 3 replies.
- [jira] [Created] (NUTCH-1390) readdb -url $url throws NPE with gora-cassandra - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2012/06/14 01:25:42 UTC, 0 replies.
- [jira] [Created] (NUTCH-1391) readdb -stats fires java.io.EOFException - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2012/06/14 01:31:43 UTC, 0 replies.
- [jira] [Created] (NUTCH-1392) -force and -resume arguments being ignored in ParserJob - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2012/06/14 01:35:42 UTC, 0 replies.
- [jira] [Created] (NUTCH-1393) Display consistent usage of GeneratorJob with 1.X - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2012/06/14 01:39:43 UTC, 0 replies.
- [jira] [Created] (NUTCH-1394) backport NUTCH-1232 Remove host field from index-basic - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2012/06/14 01:43:42 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1392) -force and -resume arguments being ignored in ParserJob - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2012/06/14 01:49:42 UTC, 2 replies.
- [jira] [Updated] (NUTCH-1394) backport NUTCH-1232 Remove site field from index-basic - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2012/06/14 09:12:43 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1392) -force and -resume arguments being ignored in ParserJob - posted by "Ferdy Galema (JIRA)" <ji...@apache.org> on 2012/06/14 09:20:42 UTC, 0 replies.
- [jira] [Created] (NUTCH-1395) Show batchId when skipping within ParserJob - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2012/06/14 14:29:42 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1392) -force and -resume arguments being ignored in ParserJob - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2012/06/14 14:35:42 UTC, 0 replies.
- [jira] [Created] (NUTCH-1396) Upgrade to Tika 1.1 - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2012/06/15 11:49:43 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1396) Upgrade to Tika 1.1 - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2012/06/15 11:53:42 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1396) Upgrade to Tika 1.1 - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2012/06/15 13:33:42 UTC, 0 replies.
- [VOTE] Apache Nutch 2.0 RC2 - posted by lewis john mcgibbney <le...@apache.org> on 2012/06/15 14:48:52 UTC, 5 replies.
- [jira] [Created] (NUTCH-1397) language-identifier incorrectly handles double-barreled language properties - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2012/06/15 15:59:42 UTC, 0 replies.
- [jira] [Closed] (NUTCH-1396) Upgrade to Tika 1.1 - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2012/06/15 16:00:50 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1081) ant tests fail - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2012/06/15 16:04:43 UTC, 1 replies.
- [jira] [Created] (NUTCH-1398) Upgrade to Hadoop 1.0.3 - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2012/06/15 16:06:42 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1398) Upgrade to Hadoop 1.0.3 - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2012/06/15 16:08:42 UTC, 3 replies.
- [jira] [Commented] (NUTCH-1397) language-identifier incorrectly handles double-barreled language properties - posted by "Ken Krugler (JIRA)" <ji...@apache.org> on 2012/06/15 16:22:42 UTC, 3 replies.
- [jira] [Comment Edited] (NUTCH-1397) language-identifier incorrectly handles double-barreled language properties - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2012/06/15 17:54:42 UTC, 0 replies.
- [jira] [Closed] (NUTCH-1081) ant tests fail - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2012/06/15 17:58:43 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1081) ant tests fail - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2012/06/15 17:58:43 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1396) Upgrade to Tika 1.1 - posted by "Hudson (JIRA)" <ji...@apache.org> on 2012/06/15 18:59:42 UTC, 0 replies.
- [jira] [Created] (NUTCH-1399) TestProtocolHttpClient fails - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2012/06/17 19:04:42 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1399) TestProtocolHttpClient fails - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2012/06/17 19:06:42 UTC, 0 replies.
- Build failed in Jenkins: Nutch-nutchgora #284 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2012/06/18 06:05:22 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #1872 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2012/06/18 06:05:30 UTC, 0 replies.
- Build failed in Jenkins: nutch-trunk-maven #317 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2012/06/18 07:02:53 UTC, 0 replies.
- Jenkins build is back to normal : Nutch-nutchgora #285 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2012/06/19 06:23:00 UTC, 0 replies.
- Jenkins build is back to normal : Nutch-trunk #1873 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2012/06/19 06:34:15 UTC, 0 replies.
- Jenkins build is back to normal : nutch-trunk-maven #318 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2012/06/19 07:04:40 UTC, 0 replies.
- [jira] [Created] (NUTCH-1400) Remove developer -core option for bin/nutch - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2012/06/19 10:50:43 UTC, 0 replies.
- Fwd: Nutch 1.5 Deploy Mode Doesn't Work like Nutch 1.4 Deploy Mode - posted by Julien Nioche <li...@gmail.com> on 2012/06/19 11:26:29 UTC, 6 replies.
- [jira] [Created] (NUTCH-1401) Upgrade to Hadoop 1.0.3 - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2012/06/19 14:46:43 UTC, 0 replies.
- [jira] [Created] (NUTCH-1402) Create AbstractScoringFilter - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2012/06/19 14:50:42 UTC, 0 replies.
- [jira] [Created] (NUTCH-1403) Add default ScoringFilter for manipulating metadata - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2012/06/19 14:56:42 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1401) Upgrade to Hadoop 1.0.3 - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2012/06/19 15:00:46 UTC, 1 replies.
- [jira] [Updated] (NUTCH-1400) Remove developer -core option for bin/nutch - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2012/06/19 15:00:46 UTC, 0 replies.
- [jira] [Created] (NUTCH-1404) Nutch script fails to find job file in deploy mode - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2012/06/19 15:09:42 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1398) Upgrade to Hadoop 1.0.3 - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2012/06/19 15:11:44 UTC, 1 replies.
- [jira] [Resolved] (NUTCH-1398) Upgrade to Hadoop 1.0.3 - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2012/06/19 15:11:45 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1401) Upgrade to Hadoop 1.0.3 - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2012/06/19 15:15:43 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1404) Nutch script fails to find job file in deploy mode - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2012/06/19 15:25:42 UTC, 0 replies.
- 1.5.1 and 2.0 RC3 [Was: Nutch 1.5 Deploy Mode Doesn't Work like Nutch 1.4 Deploy Mode] - posted by Julien Nioche <li...@gmail.com> on 2012/06/19 15:28:39 UTC, 2 replies.
- [jira] [Commented] (NUTCH-1404) Nutch script fails to find job file in deploy mode - posted by "Hudson (JIRA)" <ji...@apache.org> on 2012/06/19 16:08:42 UTC, 2 replies.
- [jira] [Resolved] (NUTCH-1399) TestProtocolHttpClient fails - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2012/06/19 16:49:43 UTC, 0 replies.
- [Nutch Wiki] Update of "DebugTool" by SebastianNagel - posted by Apache Wiki <wi...@apache.org> on 2012/06/19 22:51:16 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1401) Upgrade to Hadoop 1.0.3 - posted by "Hudson (JIRA)" <ji...@apache.org> on 2012/06/20 06:22:43 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1399) TestProtocolHttpClient fails - posted by "Hudson (JIRA)" <ji...@apache.org> on 2012/06/20 06:22:43 UTC, 0 replies.
- [jira] [Created] (NUTCH-1405) Allow to overwrite CrawlDatum's with injected entries - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2012/06/20 11:15:42 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1400) Remove developer -core option for bin/nutch - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2012/06/20 11:31:45 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1400) Remove developer -core option for bin/nutch - posted by "Hudson (JIRA)" <ji...@apache.org> on 2012/06/20 12:08:42 UTC, 2 replies.
- [jira] [Updated] (NUTCH-1391) readdb -stats fires java.io.EOFException - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2012/06/20 12:20:43 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1391) readdb -stats fires java.io.EOFException - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2012/06/20 12:30:43 UTC, 1 replies.
- [jira] [Resolved] (NUTCH-1391) readdb -stats fires java.io.EOFException - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2012/06/20 12:36:42 UTC, 0 replies.
- [Nutch Wiki] Trivial Update of "AdminGroup" by LewisJohnMcgibbney - posted by Apache Wiki <wi...@apache.org> on 2012/06/20 12:50:46 UTC, 0 replies.
- [Nutch Wiki] Update of "Nutch2Tutorial" by JulienNioche - posted by Apache Wiki <wi...@apache.org> on 2012/06/20 12:55:21 UTC, 0 replies.
- [Nutch Wiki] Update of "GORA_HBase" by FerdyGalema - posted by Apache Wiki <wi...@apache.org> on 2012/06/20 14:49:16 UTC, 0 replies.
- [Nutch Wiki] Update of "Nutch2Tutorial" by FerdyGalema - posted by Apache Wiki <wi...@apache.org> on 2012/06/20 15:02:09 UTC, 0 replies.
- [Nutch Wiki] Update of "DebugTool" by LewisJohnMcgibbney - posted by Apache Wiki <wi...@apache.org> on 2012/06/20 15:06:19 UTC, 0 replies.
- [jira] [Created] (NUTCH-1406) Metatags-index/-parse plugin: conversion to Solr date format and prevents parsing/indexing of empty tags - posted by "Kristof (JIRA)" <ji...@apache.org> on 2012/06/20 23:35:42 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1406) Metatags-index/-parse plugin: conversion to Solr date format and prevents parsing/indexing of empty tags - posted by "Kristof (JIRA)" <ji...@apache.org> on 2012/06/20 23:37:42 UTC, 10 replies.
- [jira] [Commented] (NUTCH-1406) Metatags-index/-parse plugin: conversion to Solr date format and prevents parsing/indexing of empty tags - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2012/06/21 00:33:43 UTC, 4 replies.
- [jira] [Commented] (NUTCH-1031) Delegate parsing of robots.txt to crawler-commons - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2012/06/21 12:48:42 UTC, 1 replies.
- [jira] [Closed] (NUTCH-1008) Switch to crawler-commons version of robots.txt parsing code - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2012/06/21 13:13:42 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1388) Optionally maintain custom fetch interval despite AdaptiveFetchSchedule - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2012/06/21 13:13:42 UTC, 1 replies.
- [jira] [Updated] (NUTCH-1407) BasicIndexingFilter to optionally add domain field - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2012/06/21 15:00:43 UTC, 0 replies.
- [jira] [Created] (NUTCH-1407) BasicIndexingFilter to optionally add domain field - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2012/06/21 15:00:43 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1407) BasicIndexingFilter to optionally add domain field - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2012/06/21 22:02:42 UTC, 4 replies.
- Nutch 2.0 Press Announcement - posted by Lewis John Mcgibbney <le...@gmail.com> on 2012/06/21 22:49:00 UTC, 1 replies.
- re: 1.5.1 release - posted by Markus Jelsma <ma...@openindex.io> on 2012/06/21 23:02:59 UTC, 2 replies.
- [jira] [Updated] (NUTCH-1405) Allow to overwrite CrawlDatum's with injected entries - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2012/06/22 09:18:42 UTC, 8 replies.
- [jira] [Updated] (NUTCH-1342) Read time out protocol-http - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2012/06/22 09:18:42 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1406) metadata-index plugin: conversion to Solr date format - posted by "Kristof (JIRA)" <ji...@apache.org> on 2012/06/22 10:58:42 UTC, 2 replies.
- [jira] [Updated] (NUTCH-1406) metadata-index plugin: conversion to Solr date format - posted by "Kristof (JIRA)" <ji...@apache.org> on 2012/06/22 10:58:42 UTC, 6 replies.
- [jira] [Created] (NUTCH-1408) RobotRulesParser main doesn't take URL's - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2012/06/22 13:40:43 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1408) RobotRulesParser main doesn't take URL's - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2012/06/22 13:42:42 UTC, 0 replies.
- [jira] [Created] (NUTCH-1409) Remove deprecated properties in nutch-default.xml - posted by "Matthias Agethle (JIRA)" <ji...@apache.org> on 2012/06/22 16:09:42 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1409) Remove deprecated properties in nutch-default.xml - posted by "Matthias Agethle (JIRA)" <ji...@apache.org> on 2012/06/22 16:15:42 UTC, 1 replies.
- [jira] [Commented] (NUTCH-1408) RobotRulesParser main doesn't take URL's - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2012/06/22 16:49:42 UTC, 2 replies.
- Build failed in Jenkins: Nutch-nutchgora #289 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2012/06/23 06:05:03 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #1877 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2012/06/23 06:06:15 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1406) index-metadata plugin: conversion to Solr date format - posted by "Kristof (JIRA)" <ji...@apache.org> on 2012/06/23 11:09:42 UTC, 0 replies.
- Build failed in Jenkins: nutch-trunk-maven #325 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2012/06/23 15:03:37 UTC, 0 replies.
- Jenkins build is back to normal : Nutch-nutchgora #290 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2012/06/24 06:17:51 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #1878 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2012/06/24 06:27:57 UTC, 0 replies.
- Jenkins build is back to normal : nutch-trunk-maven #326 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2012/06/24 07:02:58 UTC, 0 replies.
- [jira] [Assigned] (NUTCH-1251) Deletion of duplicates fails with org.apache.solr.client.solrj.SolrServerException - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2012/06/24 14:50:42 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #1879 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2012/06/25 06:30:24 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1408) RobotRulesParser main doesn't take URL's - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2012/06/25 16:43:44 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1407) BasicIndexingFilter to optionally add domain field - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2012/06/25 16:49:42 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1251) SolrDedup to use proper Lucene catch-all query - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2012/06/25 17:14:42 UTC, 0 replies.
- [VOTE] Apache Nutch 1.5.1 Release Candidate - posted by Lewis John Mcgibbney <le...@gmail.com> on 2012/06/25 18:13:53 UTC, 7 replies.
- [VOTE] Apache Nutch 2.0 Release Candidate #3 - posted by Lewis John Mcgibbney <le...@gmail.com> on 2012/06/25 18:32:58 UTC, 1 replies.
- Build failed in Jenkins: Nutch-trunk #1880 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2012/06/26 06:27:37 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1251) SolrDedup to use proper Lucene catch-all query - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2012/06/26 10:22:43 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1100) SolrDedup broken - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2012/06/26 10:29:44 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1233) Rely on Tika for outlink extraction - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2012/06/26 10:55:42 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1251) SolrDedup to use proper Lucene catch-all query - posted by "Hudson (JIRA)" <ji...@apache.org> on 2012/06/26 11:27:44 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1405) Allow to overwrite CrawlDatum's with injected entries - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2012/06/26 17:07:43 UTC, 9 replies.
- ant build: central list of plugins - posted by Sebastian Nagel <wa...@googlemail.com> on 2012/06/26 23:34:22 UTC, 1 replies.
- Build failed in Jenkins: Nutch-nutchgora #293 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2012/06/27 06:04:24 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #1881 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2012/06/27 06:05:24 UTC, 0 replies.
- [Nutch Wiki] Update of "Nutch2Crawling" by FerdyGalema - posted by Apache Wiki <wi...@apache.org> on 2012/06/27 09:31:11 UTC, 2 replies.
- [Nutch Wiki] Update of "FrontPage" by FerdyGalema - posted by Apache Wiki <wi...@apache.org> on 2012/06/27 09:48:26 UTC, 0 replies.
- [Nutch Wiki] Update of "Nutch2Architecture" by FerdyGalema - posted by Apache Wiki <wi...@apache.org> on 2012/06/27 09:53:15 UTC, 0 replies.
- [jira] [Created] (NUTCH-1410) impact of a map-reduce problem - posted by "behnam nikbakht (JIRA)" <ji...@apache.org> on 2012/06/27 10:17:42 UTC, 0 replies.
- [jira] [Created] (NUTCH-1411) nutchgora fetcher.store.content does not work - posted by "Ferdy Galema (JIRA)" <ji...@apache.org> on 2012/06/27 10:20:42 UTC, 0 replies.
- [Nutch Wiki] Trivial Update of "Nutch2Crawling" by LewisJohnMcgibbney - posted by Apache Wiki <wi...@apache.org> on 2012/06/27 20:39:00 UTC, 0 replies.
- Nutch Author, Publication, and Religion Detection - posted by JAB <ge...@baesystems.com> on 2012/06/27 20:59:45 UTC, 2 replies.
- Jenkins build is back to normal : Nutch-nutchgora #294 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2012/06/28 06:17:38 UTC, 0 replies.
- Jenkins build is back to normal : Nutch-trunk #1882 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2012/06/28 06:28:18 UTC, 0 replies.
- [jira] [Created] (NUTCH-1412) Upgrade commons lang - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2012/06/28 14:28:44 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1412) Upgrade commons lang - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2012/06/28 14:43:42 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1412) Upgrade commons lang - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2012/06/28 17:40:45 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1412) Upgrade commons lang - posted by "Hudson (JIRA)" <ji...@apache.org> on 2012/06/28 18:07:44 UTC, 1 replies.
- [jira] [Created] (NUTCH-1413) Fetcher to record response time - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2012/06/28 19:30:44 UTC, 0 replies.
- [jira] [Created] (NUTCH-1414) Date extraction parse filter - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2012/06/28 19:32:42 UTC, 0 replies.
- o.a.n.metadata.Office still required? - posted by Lewis John Mcgibbney <le...@gmail.com> on 2012/06/28 23:37:50 UTC, 2 replies.
- Build failed in Jenkins: Nutch-nutchgora #296 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2012/06/30 06:05:23 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #1884 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2012/06/30 06:06:34 UTC, 0 replies.
- Build failed in Jenkins: nutch-trunk-maven #336 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2012/06/30 07:01:52 UTC, 0 replies.
- [jira] [Created] (NUTCH-1415) release packages to contain top level folder apache-nutch-x.x - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2012/06/30 15:35:42 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1415) release packages to contain top level folder apache-nutch-x.x - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2012/06/30 15:41:42 UTC, 0 replies.