- [jira] Updated: (NUTCH-624) Better parsed text by default parser - posted by "Vinci (JIRA)" <ji...@apache.org> on 2008/04/01 11:08:27 UTC, 0 replies.
- [jira] Updated: (NUTCH-625) Non-ascii character broken in dumped content for mixed encoding (utf-8 and multi-byte) - posted by "Vinci (JIRA)" <ji...@apache.org> on 2008/04/01 11:20:31 UTC, 0 replies.
- Re: [jira] Created: (NUTCH-624) Better parsed text - posted by Vinci <vi...@polyu.edu.hk> on 2008/04/01 11:21:30 UTC, 0 replies.
- Is there any LSI implementation? - posted by "Edward J. Yoon" <ed...@udanax.org> on 2008/04/02 03:55:01 UTC, 1 reply.
- Build failed in Hudson: Nutch-trunk #408 - posted by Apache Hudson Server <hu...@hudson.zones.apache.org> on 2008/04/02 08:59:40 UTC, 0 replies.
- Build failed in Hudson: Nutch-trunk #409 - posted by Apache Hudson Server <hu...@hudson.zones.apache.org> on 2008/04/03 09:05:32 UTC, 0 replies.
- Hudson build is back to normal: Nutch-trunk #410 - posted by Apache Hudson Server <hu...@hudson.zones.apache.org> on 2008/04/04 08:55:45 UTC, 0 replies.
- Build failed in Hudson: Nutch-trunk #411 - posted by Apache Hudson Server <hu...@hudson.zones.apache.org> on 2008/04/05 08:06:45 UTC, 0 replies.
- Hudson build is back to normal: Nutch-trunk #412 - posted by Apache Hudson Server <hu...@hudson.zones.apache.org> on 2008/04/06 06:18:14 UTC, 0 replies.
- [jira] Created: (NUTCH-626) fetcher2 breaks out the domain with db.ignore.external.links set at cross domain redirects - posted by "Remco Verhoef (JIRA)" <ji...@apache.org> on 2008/04/06 23:18:24 UTC, 0 replies.
- [jira] Updated: (NUTCH-626) fetcher2 breaks out the domain with db.ignore.external.links set at cross domain redirects - posted by "Remco Verhoef (JIRA)" <ji...@apache.org> on 2008/04/06 23:20:25 UTC, 0 replies.
- Build failed in Hudson: Nutch-trunk #413 - posted by Apache Hudson Server <hu...@hudson.zones.apache.org> on 2008/04/07 09:07:49 UTC, 0 replies.
- Hudson build is back to normal: Nutch-trunk #414 - posted by Apache Hudson Server <hu...@hudson.zones.apache.org> on 2008/04/08 06:28:13 UTC, 0 replies.
- found a bug in plugin/protocol-http - posted by cybercouf <cy...@free.fr> on 2008/04/08 17:08:03 UTC, 0 replies.
- what is the difference between nutch and some other opensource search engines - posted by minskv <mi...@sohu.com> on 2008/04/09 20:44:51 UTC, 1 reply.
- [jira] Created: (NUTCH-627) Minimize host address lookup - posted by "Otis Gospodnetic (JIRA)" <ji...@apache.org> on 2008/04/10 06:12:04 UTC, 0 replies.
- [jira] Updated: (NUTCH-627) Minimize host address lookup - posted by "Otis Gospodnetic (JIRA)" <ji...@apache.org> on 2008/04/10 06:12:04 UTC, 4 replies.
- [jira] Commented: (NUTCH-500) Add hadoop masters configuration file into conf folder - posted by "Hudson (JIRA)" <ji...@apache.org> on 2008/04/10 06:12:04 UTC, 0 replies.
- Hudson build is back to normal: Nutch-trunk #416 - posted by Apache Hudson Server <hu...@hudson.zones.apache.org> on 2008/04/10 06:13:51 UTC, 0 replies.
- [jira] Closed: (NUTCH-500) Add hadoop masters configuration file into conf folder - posted by "Dennis Kubes (JIRA)" <ji...@apache.org> on 2008/04/10 17:24:05 UTC, 0 replies.
- [jira] Commented: (NUTCH-570) Improvement of URL Ordering in Generator.java - posted by "Otis Gospodnetic (JIRA)" <ji...@apache.org> on 2008/04/10 23:08:05 UTC, 0 replies.
- Fetcher2 Reduce Phase Question - posted by Sandeep Tata <sa...@gmail.com> on 2008/04/11 23:25:31 UTC, 1 reply.
- Keywords in documents - posted by Amit Kumar Verma <40...@infosys.com> on 2008/04/12 00:35:57 UTC, 1 reply.
- [jira] Created: (NUTCH-628) Host database to keep track of host-level information - posted by "Otis Gospodnetic (JIRA)" <ji...@apache.org> on 2008/04/12 06:21:05 UTC, 0 replies.
- [jira] Created: (NUTCH-629) Detect slow and timeout servers and drop their URLs - posted by "Otis Gospodnetic (JIRA)" <ji...@apache.org> on 2008/04/12 09:07:04 UTC, 0 replies.
- [jira] Updated: (NUTCH-629) Detect slow and timeout servers and drop their URLs - posted by "Otis Gospodnetic (JIRA)" <ji...@apache.org> on 2008/04/12 09:09:07 UTC, 0 replies.
- Wiki -> email -> nutch-dev? - posted by og...@yahoo.com on 2008/04/13 05:55:58 UTC, 4 replies.
- [jira] Commented: (NUTCH-629) Detect slow and timeout servers and drop their URLs - posted by "Otis Gospodnetic (JIRA)" <ji...@apache.org> on 2008/04/14 21:47:04 UTC, 0 replies.
- [jira] Commented: (NUTCH-442) Integrate Solr/Nutch - posted by "Otis Gospodnetic (JIRA)" <ji...@apache.org> on 2008/04/14 22:23:05 UTC, 2 replies.
- [jira] Updated: (NUTCH-628) Host database to keep track of host-level information - posted by "Otis Gospodnetic (JIRA)" <ji...@apache.org> on 2008/04/15 18:17:06 UTC, 1 reply.
- [jira] Issue Comment Edited: (NUTCH-628) Host database to keep track of host-level information - posted by "Otis Gospodnetic (JIRA)" <ji...@apache.org> on 2008/04/17 07:41:21 UTC, 2 replies.
- [jira] Commented: (NUTCH-628) Host database to keep track of host-level information - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2008/04/17 10:11:21 UTC, 8 replies.
- [jira] Updated: (NUTCH-442) Integrate Solr/Nutch - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2008/04/17 16:15:22 UTC, 0 replies.
- [jira] Updated: (NUTCH-596) ParseSegments parse content even if its not CrawlDatum.STATUS_FETCH_SUCCESS - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2008/04/17 17:03:21 UTC, 2 replies.
- [jira] Assigned: (NUTCH-596) ParseSegments parse content even if its not CrawlDatum.STATUS_FETCH_SUCCESS - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2008/04/17 17:03:21 UTC, 0 replies.
- [jira] Commented: (NUTCH-596) ParseSegments parse content even if its not CrawlDatum.STATUS_FETCH_SUCCESS - posted by "Otis Gospodnetic (JIRA)" <ji...@apache.org> on 2008/04/18 17:32:24 UTC, 2 replies.
- [jira] Resolved: (NUTCH-596) ParseSegments parse content even if its not CrawlDatum.STATUS_FETCH_SUCCESS - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2008/04/18 20:50:23 UTC, 0 replies.
- [jira] Closed: (NUTCH-596) ParseSegments parse content even if its not CrawlDatum.STATUS_FETCH_SUCCESS - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2008/04/18 20:52:21 UTC, 0 replies.
- Fw: [jira] Closed: (INFRA-1583) Wiki => email not working for Nutch wiki - posted by og...@yahoo.com on 2008/04/19 21:49:39 UTC, 0 replies.
- Re: Fetching inefficiency - posted by og...@yahoo.com on 2008/04/21 22:40:04 UTC, 1 reply.
- [Nutch Wiki] Update of "GettingNutchRunningWithDebian" by StevenHayles - posted by Apache Wiki <wi...@apache.org> on 2008/04/22 15:12:55 UTC, 0 replies.
- [Nutch Wiki] Update of "FetchCycleOverlap" by OtisGospodnetic - posted by Apache Wiki <wi...@apache.org> on 2008/04/23 07:58:42 UTC, 1 reply.
- [Nutch Wiki] Update of "Nutch2Architecture" by DennisKubes - posted by Apache Wiki <wi...@apache.org> on 2008/04/24 21:42:29 UTC, 1 reply.
- Nutch 2 Architecture - posted by in...@web2seo.com on 2008/04/25 03:22:05 UTC, 1 reply.
- [jira] Created: (NUTCH-630) Error caused by index-more plugin in the latest svn revision - 652259 - posted by "taknev ivrok (JIRA)" <ji...@apache.org> on 2008/04/30 17:21:56 UTC, 0 replies.