- [jira] [Commented] (NUTCH-1039) Fetcher fails for pages without content-length header - posted by "Ferdy (JIRA)" <ji...@apache.org> on 2011/09/01 10:30:09 UTC, 0 replies.
- [jira] [Assigned] (NUTCH-937) When nutch is run on hadoop > 0.20.2 (or cdh) it will not find plugins because MapReduce will not unpack plugin/ directory from the job's pack (due to MAPREDUCE-967) - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2011/09/01 13:00:10 UTC, 0 replies.
- [jira] [Updated] (NUTCH-937) When nutch is run on hadoop > 0.20.2 (or cdh) it will not find plugins because MapReduce will not unpack plugin/ directory from the job's pack (due to MAPREDUCE-967) - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2011/09/01 13:02:10 UTC, 0 replies.
- [jira] [Commented] (NUTCH-937) When nutch is run on hadoop > 0.20.2 (or cdh) it will not find plugins because MapReduce will not unpack plugin/ directory from the job's pack (due to MAPREDUCE-967) - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2011/09/01 13:04:09 UTC, 3 replies.
- [jira] [Resolved] (NUTCH-1073) Rename parameters 'fetcher.threads.per.host.by.ip' and 'fetcher.threads.per.host' - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2011/09/01 15:09:09 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1073) Rename parameters 'fetcher.threads.per.host.by.ip' and 'fetcher.threads.per.host' - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/01 15:27:09 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1096) Empty (not null) ContentLength results in failure of fetch - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2011/09/01 17:16:17 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1102) Fetcher, rely on fetcher.parse directive only - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2011/09/01 19:10:09 UTC, 2 replies.
- [jira] [Created] (NUTCH-1103) Port protocol-sftp to 1.4 - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/01 20:58:09 UTC, 0 replies.
- Re: Page deletion and tracking change between crawlings - posted by Julio Garcés Teuber <ju...@xinergia.com> on 2011/09/02 15:06:59 UTC, 2 replies.
- [jira] [Updated] (NUTCH-1097) application/xhtml+xml should be enabled in plugin.xml of parse-html; allow multiple mimetypes for plugin.xml - posted by "Ferdy (JIRA)" <ji...@apache.org> on 2011/09/02 15:34:09 UTC, 5 replies.
- Protocol not found or MalformedUrl protocol-sftp - posted by Markus Jelsma <ma...@openindex.io> on 2011/09/02 15:57:28 UTC, 1 replies.
- [jira] [Commented] (NUTCH-1097) application/xhtml+xml should be enabled in plugin.xml of parse-html; allow multiple mimetypes for plugin.xml - posted by "Ferdy (JIRA)" <ji...@apache.org> on 2011/09/02 16:00:10 UTC, 0 replies.
- [jira] [Issue Comment Edited] (NUTCH-1097) application/xhtml+xml should be enabled in plugin.xml of parse-html; allow multiple mimetypes for plugin.xml - posted by "Ferdy (JIRA)" <ji...@apache.org> on 2011/09/02 16:00:11 UTC, 1 replies.
- [Nutch Wiki] Trivial Update of "RunningNutchAndSolr" by LewisJohnMcgibbney - posted by Apache Wiki <wi...@apache.org> on 2011/09/02 21:18:19 UTC, 4 replies.
- [Nutch Wiki] Trivial Update of "NutchTutorial" by LewisJohnMcgibbney - posted by Apache Wiki <wi...@apache.org> on 2011/09/02 21:47:49 UTC, 5 replies.
- [Nutch Wiki] Trivial Update of "Archive and Legacy" by LewisJohnMcgibbney - posted by Apache Wiki <wi...@apache.org> on 2011/09/02 21:48:59 UTC, 4 replies.
- [Nutch Wiki] Trivial Update of "FrontPage" by LewisJohnMcgibbney - posted by Apache Wiki <wi...@apache.org> on 2011/09/02 21:52:01 UTC, 6 replies.
- [Nutch Wiki] Trivial Update of "OldHadoopTutorial" by LewisJohnMcgibbney - posted by Apache Wiki <wi...@apache.org> on 2011/09/02 21:58:02 UTC, 0 replies.
- [Nutch Wiki] Trivial Update of "NutchHadoopTutorial" by LewisJohnMcgibbney - posted by Apache Wiki <wi...@apache.org> on 2011/09/02 22:10:26 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #1592 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2011/09/03 06:06:21 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #1593 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2011/09/04 06:13:07 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #1594 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2011/09/05 06:13:00 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #1595 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2011/09/06 06:12:17 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1052) Multiple deletes of the same URL using SolrClean - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/06 10:24:09 UTC, 8 replies.
- [jira] [Updated] (NUTCH-1052) Multiple deletes of the same URL using SolrClean - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/06 12:34:09 UTC, 3 replies.
- [jira] [Assigned] (NUTCH-1052) Multiple deletes of the same URL using SolrClean - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/06 13:35:10 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1067) Configure minimum throughput for fetcher - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/06 13:53:10 UTC, 6 replies.
- [jira] [Commented] (NUTCH-1101) Options to purge db_gone records in updatedb - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/06 13:55:10 UTC, 1 replies.
- [jira] [Commented] (NUTCH-1028) Log parser keys - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/06 13:57:10 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-987) Support HTTP auth for Solr communication - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/06 13:59:09 UTC, 0 replies.
- [jira] [Updated] (NUTCH-987) Support HTTP auth for Solr communication - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/06 13:59:09 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1057) Make fetcher thread time out configurable - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/06 13:59:10 UTC, 0 replies.
- [jira] [Issue Comment Edited] (NUTCH-987) Support HTTP auth for Solr communication - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/06 13:59:10 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1057) Make fetcher thread time out configurable - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/06 13:59:10 UTC, 0 replies.
- [jira] [Created] (NUTCH-1104) Port issues from 1.x to trunk - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/06 13:59:10 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1104) Port issues from 1.x to trunk - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/06 14:00:15 UTC, 7 replies.
- [jira] [Updated] (NUTCH-1036) Solr jobs should increment counters in Reporter - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/06 14:02:10 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1036) Solr jobs should increment counters in Reporter - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/06 14:02:10 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1101) Options to purge db_gone records in updatedb - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/06 14:56:09 UTC, 0 replies.
- exposing generator.max.num.segments in nutch-default.xml and to Crawl command - posted by Ferdy Galema <fe...@kalooga.com> on 2011/09/06 17:07:46 UTC, 1 replies.
- Build failed in Jenkins: Nutch-trunk #1596 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2011/09/07 06:14:00 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1074) topN is ignored with maxNumSegments - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/07 11:53:10 UTC, 3 replies.
- [jira] [Issue Comment Edited] (NUTCH-1074) topN is ignored with maxNumSegments - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/07 11:55:16 UTC, 2 replies.
- [Nutch Wiki] Trivial Update of "PublicServers" by MarkusJelsma - posted by Apache Wiki <wi...@apache.org> on 2011/09/07 12:26:47 UTC, 2 replies.
- [jira] [Created] (NUTCH-1105) MaxContentLength option for index-basic - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/07 14:26:10 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1105) MaxContentLength option for index-basic - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/07 14:26:10 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #1597 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2011/09/08 06:05:19 UTC, 0 replies.
- [jira] [Closed] (NUTCH-1095) remove i18n from Nutch site to archive and legacy secton of wiki - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2011/09/08 19:34:08 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1095) remove i18n from Nutch site to archive and legacy secton of wiki - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2011/09/08 19:34:08 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1099) Add HBase and Cassandra storage properties to nutch-default.xml - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2011/09/08 19:36:08 UTC, 2 replies.
- [Nutch Wiki] Trivial Update of "IndexStructure" by LewisJohnMcgibbney - posted by Apache Wiki <wi...@apache.org> on 2011/09/08 21:40:59 UTC, 3 replies.
- [jira] [Commented] (NUTCH-841) Nutch 2.0 webapp - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2011/09/08 22:16:08 UTC, 2 replies.
- Re: Correct Nutch tutorial - posted by lewis john mcgibbney <le...@gmail.com> on 2011/09/08 22:29:56 UTC, 0 replies.
- [Nutch Wiki] Trivial Update of "WritingPluginExample" by LewisJohnMcgibbney - posted by Apache Wiki <wi...@apache.org> on 2011/09/08 22:39:38 UTC, 2 replies.
- Build failed in Jenkins: Nutch-trunk #1598 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2011/09/09 06:12:22 UTC, 0 replies.
- [jira] [Assigned] (NUTCH-1088) Write Solr XML documents - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/09 12:52:09 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1105) MaxContentLength option for index-basic - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/09 13:08:08 UTC, 3 replies.
- [jira] [Resolved] (NUTCH-1101) Options to purge db_gone records in updatedb - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/09 13:18:08 UTC, 0 replies.
- [jira] [Created] (NUTCH-1106) Options to skip url's based on length - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/09 14:07:08 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1106) Options to skip url's based on length - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/09 14:09:09 UTC, 1 replies.
- unsubscribe - posted by Greg Boulter <gr...@hotmail.com> on 2011/09/09 20:19:37 UTC, 0 replies.
- [jira] [Created] (NUTCH-1107) Log slow parse entries - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/09 22:00:09 UTC, 0 replies.
- [jira] [Created] (NUTCH-1108) Index image and video format with nutch 1.3 - posted by "hadi (JIRA)" <ji...@apache.org> on 2011/09/10 07:45:08 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1108) Index image and video format with nutch 1.3 - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2011/09/10 10:17:08 UTC, 6 replies.
- [jira] [Created] (NUTCH-1109) Add Sonar targets to Ant build.xml - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2011/09/10 14:31:08 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1109) Add Sonar targets to Ant build.xml - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2011/09/10 14:39:08 UTC, 2 replies.
- [Nutch Wiki] Update of "NutchTutorial" by RichardLloyd - posted by Apache Wiki <wi...@apache.org> on 2011/09/10 16:37:54 UTC, 0 replies.
- [Nutch Wiki] Trivial Update of "NutchTutorial" by RichardLloyd - posted by Apache Wiki <wi...@apache.org> on 2011/09/10 16:52:03 UTC, 0 replies.
- [Nutch Wiki] Trivial Update of "CommandLineOptions" by LewisJohnMcgibbney - posted by Apache Wiki <wi...@apache.org> on 2011/09/10 18:56:11 UTC, 1 replies.
- [Nutch Wiki] Trivial Update of "bin/nutch domainstats" by LewisJohnMcgibbney - posted by Apache Wiki <wi...@apache.org> on 2011/09/10 19:03:41 UTC, 0 replies.
- [jira] [Commented] (NUTCH-296) Image Search - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2011/09/10 19:14:09 UTC, 0 replies.
- [jira] [Updated] (NUTCH-623) Change plugin source directory "languageidentifier" to "language-identifier" - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2011/09/10 22:39:09 UTC, 2 replies.
- [jira] [Updated] (NUTCH-940) static field plugin - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2011/09/10 22:59:09 UTC, 2 replies.
- [jira] [Commented] (NUTCH-940) static field plugin - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2011/09/10 23:35:08 UTC, 7 replies.
- [jira] [Commented] (NUTCH-914) Implement Apache Project Branding Requirements - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2011/09/11 01:51:08 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #1600 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2011/09/11 06:12:33 UTC, 0 replies.
- [jira] [Closed] (NUTCH-914) Implement Apache Project Branding Requirements - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2011/09/11 09:55:08 UTC, 0 replies.
- [jira] [Commented] (NUTCH-623) Change plugin source directory "languageidentifier" to "language-identifier" - posted by "Ignacio J. Ortega (JIRA)" <ji...@apache.org> on 2011/09/11 17:58:08 UTC, 3 replies.
- [jira] [Closed] (NUTCH-1099) Add HBase and Cassandra storage properties to nutch-default.xml - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2011/09/11 18:42:08 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1099) Add HBase and Cassandra storage properties to nutch-default.xml - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2011/09/11 18:42:08 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-296) Image Search - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2011/09/11 18:44:08 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1078) Upgrade all instances of commons logging to slf4j (with log4j backend) - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2011/09/11 22:22:08 UTC, 4 replies.
- Build failed in Jenkins: Nutch-trunk #1601 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2011/09/12 06:10:08 UTC, 0 replies.
- How to see links in offline mode? - posted by ahmad ajiloo <ah...@gmail.com> on 2011/09/12 09:49:46 UTC, 2 replies.
- [jira] [Updated] (NUTCH-1005) Index headings h1 and h2 - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/12 11:27:08 UTC, 1 replies.
- [jira] [Closed] (NUTCH-1108) Index image and video format with nutch 1.3 - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/12 11:31:09 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1106) Options to skip url's based on length - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/12 14:05:09 UTC, 2 replies.
- [jira] [Resolved] (NUTCH-1105) MaxContentLength option for index-basic - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/12 14:23:08 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1005) Index headings plugin - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/12 14:53:12 UTC, 1 replies.
- [jira] [Resolved] (NUTCH-924) Static field in solr mapping - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/12 14:59:08 UTC, 0 replies.
- [jira] [Assigned] (NUTCH-1026) Strip UTF-8 non-character codepoints - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/12 15:03:08 UTC, 0 replies.
- [jira] [Assigned] (NUTCH-979) Add support for deleting Solr documents with ProtocolStatusCodes.NOTFOUND - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/12 15:03:09 UTC, 0 replies.
- [jira] [Issue Comment Edited] (NUTCH-1052) Multiple deletes of the same URL using SolrClean - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/12 15:45:09 UTC, 0 replies.
- [jira] [Commented] (NUTCH-970) Injector job crashes with MySQL with table collation set to utf8_general_ci - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/12 22:37:08 UTC, 1 replies.
- Build failed in Jenkins: Nutch-trunk #1602 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2011/09/13 06:10:22 UTC, 0 replies.
- unbalanced tasks in Fetching - posted by Dillon Yang <di...@gmail.com> on 2011/09/13 06:40:55 UTC, 0 replies.
- [jira] [Created] (NUTCH-1110) Updatedb must not write _SUCCESS file - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/13 11:06:08 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1084) ReadDB url throws exception - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/13 11:16:14 UTC, 1 replies.
- [jira] [Created] (NUTCH-1111) Cashed previous link - posted by "hadi (JIRA)" <ji...@apache.org> on 2011/09/13 12:06:09 UTC, 0 replies.
- [jira] [Closed] (NUTCH-1111) Cashed previous link - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/13 12:12:08 UTC, 0 replies.
- delete - posted by SC Interactive Global Media SRL <va...@interactivegm.com> on 2011/09/13 12:15:32 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1110) Updatedb must not write _SUCCESS file - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/13 12:18:08 UTC, 0 replies.
- [jira] [Commented] (NUTCH-578) URL fetched with 403 is generated over and over again - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/13 12:20:08 UTC, 0 replies.
- [Nutch Wiki] Update of "CommandLineOptions" by MarkusJelsma - posted by Apache Wiki <wi...@apache.org> on 2011/09/13 14:09:16 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1110) Updatedb must not write _SUCCESS file - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/13 20:17:11 UTC, 0 replies.
- [jira] [Closed] (NUTCH-1051) Export WebGraph node scores for solr.ExternalFileField - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/13 23:27:09 UTC, 0 replies.
- [jira] [Closed] (NUTCH-1105) MaxContentLength option for index-basic - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/13 23:27:09 UTC, 0 replies.
- [jira] [Closed] (NUTCH-771) Add WebGraph classes to the bin/nutch script - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/13 23:27:09 UTC, 0 replies.
- [jira] [Closed] (NUTCH-987) Support HTTP auth for Solr communication - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/13 23:27:09 UTC, 0 replies.
- [jira] [Closed] (NUTCH-1004) Do not index empty values for title field - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/13 23:27:09 UTC, 0 replies.
- [jira] [Closed] (NUTCH-1050) Add segmentDir option to WebGraph - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/13 23:27:09 UTC, 0 replies.
- [jira] [Closed] (NUTCH-1027) Degrade log level of `can't find rules for scope` - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/13 23:27:09 UTC, 0 replies.
- [jira] [Closed] (NUTCH-1012) Cannot handle illegal charset $charset - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/13 23:27:10 UTC, 0 replies.
- [jira] [Closed] (NUTCH-1006) meta equiv with single quotes not accepted - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/13 23:27:10 UTC, 0 replies.
- [jira] [Closed] (NUTCH-1030) WebgraphDB program requires manually added directories - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/13 23:27:10 UTC, 0 replies.
- [jira] [Closed] (NUTCH-1037) Deduplicate anchors before indexing - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/13 23:27:10 UTC, 0 replies.
- [jira] [Closed] (NUTCH-1110) Updatedb must not write _SUCCESS file - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/13 23:27:10 UTC, 0 replies.
- [jira] [Closed] (NUTCH-1049) Add classes to bin/nutch - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/13 23:27:11 UTC, 0 replies.
- [jira] [Closed] (NUTCH-1069) Readlinkdb broken on Hadoop > 0.20 - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/13 23:27:11 UTC, 0 replies.
- [jira] [Closed] (NUTCH-1010) ContentLength not trimmed - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/13 23:27:11 UTC, 0 replies.
- [jira] [Closed] (NUTCH-1000) Add option not to commit to Solr - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/13 23:27:11 UTC, 0 replies.
- [jira] [Closed] (NUTCH-1036) Solr jobs should increment counters in Reporter - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/13 23:27:11 UTC, 0 replies.
- [jira] [Closed] (NUTCH-1082) IndexingFiltersChecker utility does not list multi valued fields - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/13 23:27:11 UTC, 0 replies.
- [jira] [Closed] (NUTCH-1029) Readdb throws EOFException - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/13 23:27:11 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #1603 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2011/09/14 07:09:46 UTC, 0 replies.
- [jira] [Created] (NUTCH-1112) protocol-httpclient doesn't accept content when all of it fits in the buffer at once - posted by "Edward Drapkin (JIRA)" <ji...@apache.org> on 2011/09/14 08:28:16 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1112) protocol-httpclient doesn't accept content when all of it fits in the buffer at once - posted by "Edward Drapkin (JIRA)" <ji...@apache.org> on 2011/09/14 08:30:10 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1112) off-by-one error in protocol-httpclient; truncates up to HttpBase.BUFFER_SIZE content - posted by "Edward Drapkin (JIRA)" <ji...@apache.org> on 2011/09/14 09:24:10 UTC, 3 replies.
- [jira] [Commented] (NUTCH-1112) off-by-one error in protocol-httpclient; truncates up to HttpBase.BUFFER_SIZE content - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2011/09/14 11:31:10 UTC, 1 replies.
- [jira] [Resolved] (NUTCH-1067) Configure minimum throughput for fetcher - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/14 13:01:09 UTC, 1 replies.
- [jira] [Resolved] (NUTCH-1102) Fetcher, rely on fetcher.parse directive only - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/14 13:01:09 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1079) StringBuffer converted to StringBuilder - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/14 13:19:08 UTC, 1 replies.
- [jira] [Commented] (NUTCH-1005) Index headings plugin - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2011/09/14 13:41:09 UTC, 6 replies.
- [jira] [Reopened] (NUTCH-1067) Configure minimum throughput for fetcher - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2011/09/14 13:51:09 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1067) Configure minimum throughput for fetcher - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/14 14:07:08 UTC, 0 replies.
- Setting up Jenkins CI for Nutch Branches - posted by lewis john mcgibbney <le...@gmail.com> on 2011/09/15 20:08:35 UTC, 4 replies.
- Re: Future of Nutch 2.0 [Was: Unresolved dependencies org.apache.gora#gora-hbase;0.1: not found in Nutch trunk] - posted by Markus Jelsma <ma...@openindex.io> on 2011/09/15 20:55:25 UTC, 10 replies.
- [Nutch Wiki] Trivial Update of "OldFAQs" by LewisJohnMcgibbney - posted by Apache Wiki <wi...@apache.org> on 2011/09/15 21:00:13 UTC, 2 replies.
- [Nutch Wiki] Trivial Update of "ErrorMessages" by LewisJohnMcgibbney - posted by Apache Wiki <wi...@apache.org> on 2011/09/15 21:44:53 UTC, 3 replies.
- [Nutch Wiki] Trivial Update of "FAQ" by LewisJohnMcgibbney - posted by Apache Wiki <wi...@apache.org> on 2011/09/15 22:00:23 UTC, 3 replies.
- [jira] [Created] (NUTCH-1113) Merging segments causes URLs to vanish from crawldb/index? - posted by "Edward Drapkin (JIRA)" <ji...@apache.org> on 2011/09/15 22:03:08 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1113) Merging segments causes URLs to vanish from crawldb/index? - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/15 22:25:10 UTC, 6 replies.
- [jira] [Commented] (NUTCH-251) Administration GUI - posted by "hadi (JIRA)" <ji...@apache.org> on 2011/09/15 22:51:14 UTC, 3 replies.
- [jira] [Updated] (NUTCH-1113) Merging segments causes URLs to vanish from crawldb/index? - posted by "Edward Drapkin (JIRA)" <ji...@apache.org> on 2011/09/15 23:21:09 UTC, 2 replies.
- [jira] [Issue Comment Edited] (NUTCH-1113) Merging segments causes URLs to vanish from crawldb/index? - posted by "Edward Drapkin (JIRA)" <ji...@apache.org> on 2011/09/15 23:53:09 UTC, 0 replies.
- Build failed in Jenkins: Nutch-branch-1.4 #1 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2011/09/16 00:08:46 UTC, 0 replies.
- Build failed in Jenkins: Nutch-branch-1.4 #2 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2011/09/16 00:09:09 UTC, 1 replies.
- Build failed in Jenkins: Nutch-branch-1.4 #3 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2011/09/16 00:22:52 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #1605 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2011/09/16 06:12:00 UTC, 0 replies.
- [jira] [Closed] (NUTCH-1112) off-by-one error in protocol-httpclient; truncates up to HttpBase.BUFFER_SIZE content - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2011/09/16 10:43:09 UTC, 0 replies.
- Build failed in Jenkins: Nutch-branch-1.4 #5 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2011/09/16 12:51:51 UTC, 0 replies.
- Build failed in Jenkins: Nutch-branch-1.4 #6 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2011/09/16 13:05:54 UTC, 0 replies.
- Jenkins build is back to normal : Nutch-branch-1.4 #7 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2011/09/16 13:11:26 UTC, 4 replies.
- [jira] [Commented] (NUTCH-1078) Upgrade all instances of commons logging to slf4j (with log4j backend) - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/16 15:36:09 UTC, 18 replies.
- [jira] [Issue Comment Edited] (NUTCH-1078) Upgrade all instances of commons logging to slf4j (with log4j backend) - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2011/09/16 18:24:09 UTC, 0 replies.
- [Nutch Wiki] Trivial Update of "Website_Update_HOWTO" by LewisJohnMcgibbney - posted by Apache Wiki <wi...@apache.org> on 2011/09/16 18:45:02 UTC, 2 replies.
- Build failed in Jenkins: Nutch-trunk #1606 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2011/09/17 06:05:23 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1092) overhaul FAQ's and publish to Nutch site - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2011/09/17 20:47:09 UTC, 1 replies.
- [jira] [Issue Comment Edited] (NUTCH-1092) overhaul FAQ's and publish to Nutch site - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2011/09/17 20:51:08 UTC, 0 replies.
- [jira] [Commented] (NUTCH-585) [PARSE-HTML plugin] Block certain parts of HTML code from being indexed - posted by "Rui Araújo (JIRA)" <ji...@apache.org> on 2011/09/18 02:00:16 UTC, 1 replies.
- [jira] [Updated] (NUTCH-585) [PARSE-HTML plugin] Block certain parts of HTML code from being indexed - posted by "Rui Araújo (JIRA)" <ji...@apache.org> on 2011/09/18 02:00:17 UTC, 3 replies.
- Build failed in Jenkins: Nutch-trunk #1607 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2011/09/18 06:04:23 UTC, 0 replies.
- Build failed in Jenkins: Nutch-branch-1.4 #8 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2011/09/18 06:31:19 UTC, 3 replies.
- [VOTE] Move 2.0 out of trunk - posted by Julien Nioche <li...@gmail.com> on 2011/09/18 11:21:27 UTC, 11 replies.
- Jenkins build is back to normal : Nutch-branch-1.4 #9 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2011/09/18 21:29:31 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1109) Add Sonar targets to Ant build.xml - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2011/09/18 21:53:08 UTC, 1 replies.
- [DISCUSS] What will happen to Nutch Gora aka Nutchbase (was Re: [VOTE] Move 2.0 out of trunk) - posted by "Mattmann, Chris A (388J)" <ch...@jpl.nasa.gov> on 2011/09/19 01:24:54 UTC, 3 replies.
- [jira] [Assigned] (NUTCH-1025) Add option not to commit to Solr - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/19 14:05:14 UTC, 0 replies.
- [jira] [Created] (NUTCH-1114) Attr file missing in domain filter - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/19 16:13:09 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1114) Attr file missing in domain filter - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/19 16:17:08 UTC, 1 replies.
- [jira] [Resolved] (NUTCH-1114) Attr file missing in domain filter - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/19 16:17:09 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1028) Log parser keys - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/19 17:12:09 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-940) static field plugin - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2011/09/19 17:29:09 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1028) Log parser keys - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/19 17:29:09 UTC, 0 replies.
- [jira] [Closed] (NUTCH-940) static field plugin - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2011/09/19 17:29:10 UTC, 0 replies.
- [jira] [Commented] (NUTCH-208) http: proxy exception list: - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2011/09/19 17:35:09 UTC, 1 replies.
- [jira] [Updated] (NUTCH-208) http: proxy exception list: - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2011/09/19 17:41:10 UTC, 3 replies.
- [jira] [Created] (NUTCH-1115) Option to disable fixing of embedded params in DomContentUtils - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/19 18:55:08 UTC, 0 replies.
- [RESULT] [VOTE] Move 2.0 out of trunk - posted by Julien Nioche <li...@gmail.com> on 2011/09/21 12:09:21 UTC, 6 replies.
- Extension of NUTCH-585 - blacklist whitelist plugin - posted by Elisabeth Adler <el...@gmail.com> on 2011/09/21 17:47:45 UTC, 2 replies.
- [jira] [Updated] (NUTCH-1115) Option to disable fixing of embedded params in DomContentUtils - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/22 14:46:26 UTC, 1 replies.
- [jira] [Commented] (NUTCH-1115) Option to disable fixing of embedded params in DomContentUtils - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/22 14:55:26 UTC, 2 replies.
- [jira] [Resolved] (NUTCH-1115) Option to disable fixing of embedded params in DomContentUtils - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/22 16:04:27 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1092) overhaul FAQ's and publish to Nutch site - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2011/09/22 17:06:26 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1078) Upgrade all instances of commons logging to slf4j (with log4j backend) - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2011/09/22 17:12:26 UTC, 1 replies.
- [jira] [Closed] (NUTCH-1078) Upgrade all instances of commons logging to slf4j (with log4j backend) - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2011/09/22 17:12:26 UTC, 1 replies.
- [jira] [Assigned] (NUTCH-1074) topN is ignored with maxNumSegments - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/22 21:35:28 UTC, 0 replies.
- [jira] [Reopened] (NUTCH-1078) Upgrade all instances of commons logging to slf4j (with log4j backend) - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/22 21:40:28 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1074) topN is ignored with maxNumSegments - posted by "Robert Thomson (JIRA)" <ji...@apache.org> on 2011/09/23 04:39:26 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1074) topN is ignored with maxNumSegments - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/09/23 14:10:26 UTC, 0 replies.
- [jira] [Assigned] (NUTCH-1093) create core documentation - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2011/09/23 19:26:26 UTC, 0 replies.
- Nutch site documentation - posted by lewis john mcgibbney <le...@gmail.com> on 2011/09/23 20:10:41 UTC, 2 replies.
- [jira] [Updated] (NUTCH-1091) Remove commons logging dependency from Nutch branch and trunk - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2011/09/23 22:33:26 UTC, 1 replies.
- [NOTICE] Nutch trunk is now 1.4-snapshot and Nutch 2.0 trunk is now the Nutch Gora branch - posted by "Mattmann, Chris A (388J)" <ch...@jpl.nasa.gov> on 2011/09/24 01:57:27 UTC, 8 replies.
- [jira] [Resolved] (NUTCH-1093) create core documentation - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2011/09/24 13:23:26 UTC, 0 replies.
- [jira] [Closed] (NUTCH-1093) create core documentation - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2011/09/24 13:23:26 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1093) create core documentation - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2011/09/24 13:23:26 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1081) ant tests fail - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2011/09/24 13:25:26 UTC, 1 replies.
- Nutchgora Jenkins CI builds - posted by lewis john mcgibbney <le...@gmail.com> on 2011/09/24 14:16:24 UTC, 4 replies.
- Build failed in Jenkins: Nutch-trunk #1608 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2011/09/24 14:24:05 UTC, 1 replies.
- [jira] [Commented] (NUTCH-881) Good quality documentation for Nutch - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2011/09/24 14:41:26 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #1609 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2011/09/24 14:45:56 UTC, 0 replies.
- [jira] [Commented] (NUTCH-657) Estonian N-gram profile has wrong name - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2011/09/24 14:47:26 UTC, 2 replies.
- Jenkins build is back to normal : Nutch-trunk #1610 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2011/09/24 14:58:05 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-623) Change plugin source directory "languageidentifier" to "language-identifier" - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2011/09/24 18:27:26 UTC, 0 replies.
- [jira] [Created] (NUTCH-1116) Write JUnit tests for all plugins - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2011/09/24 18:35:26 UTC, 0 replies.
- [jira] [Created] (NUTCH-1117) JUnit test for index-anchor - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2011/09/24 18:37:26 UTC, 0 replies.
- [jira] [Created] (NUTCH-1119) JUnit test for index-static - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2011/09/24 18:39:26 UTC, 0 replies.
- [jira] [Created] (NUTCH-1118) JUnit test for index-basic - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2011/09/24 18:39:26 UTC, 0 replies.
- [jira] [Created] (NUTCH-1120) JUnit test for microformats-reltag - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2011/09/24 18:41:26 UTC, 0 replies.
- [jira] [Created] (NUTCH-1121) JUnit test for parse-js - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2011/09/24 18:43:26 UTC, 0 replies.
- [jira] [Created] (NUTCH-1122) JUnit test for protocol-ftp - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2011/09/24 18:43:26 UTC, 0 replies.
- [jira] [Created] (NUTCH-1123) JUnit test for scoring-link - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2011/09/24 18:45:26 UTC, 0 replies.
- [jira] [Created] (NUTCH-1124) JUnit test for scoring-opic - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2011/09/24 18:45:26 UTC, 0 replies.
- [jira] [Created] (NUTCH-1126) JUnit test for urlfilter-prefix - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2011/09/24 18:47:26 UTC, 0 replies.
- [jira] [Created] (NUTCH-1125) JUnit test for tld - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2011/09/24 18:47:26 UTC, 0 replies.
- [jira] [Created] (NUTCH-1127) JUnit test for urlfilter-validator - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2011/09/24 18:49:26 UTC, 0 replies.
- [jira] [Created] (NUTCH-1128) JUnit test for urlmeta - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2011/09/24 18:49:26 UTC, 0 replies.
- [jira] [Created] (NUTCH-1129) Any23 Nutch plugin - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2011/09/24 23:14:26 UTC, 0 replies.
- [jira] [Created] (NUTCH-1130) JUnit test for Any23 RDF plugin - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2011/09/24 23:16:26 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1129) Any23 Nutch plugin - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2011/09/24 23:26:26 UTC, 1 replies.
- Build failed in Jenkins: Nutch-trunk #1611 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2011/09/25 06:22:20 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1098) better url-normalizer basic - posted by "Radim Kolar (JIRA)" <ji...@apache.org> on 2011/09/25 21:27:26 UTC, 1 replies.
- [jira] [Created] (NUTCH-1131) Rely on published artefacts for GORA - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2011/09/25 22:43:26 UTC, 0 replies.
- [jira] [Closed] (NUTCH-1131) Rely on published artefacts for GORA - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2011/09/25 22:49:26 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1131) Rely on published artefacts for GORA - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2011/09/25 22:49:26 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #1612 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2011/09/26 06:25:39 UTC, 0 replies.
- Jenkins build is back to normal : Nutch-trunk #1613 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2011/09/26 18:56:10 UTC, 1 replies.
- [jira] [Commented] (NUTCH-1081) ant tests fail - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2011/09/26 19:17:26 UTC, 0 replies.
- [jira] [Created] (NUTCH-1132) Fix TestGenerator for Nutchgora - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2011/09/26 19:19:28 UTC, 0 replies.
- [jira] [Created] (NUTCH-1133) Fix TestInjector for Nutchgora - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2011/09/26 19:21:26 UTC, 0 replies.
- [jira] [Created] (NUTCH-1135) Fix TestGoraStorage for Nutchgora - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2011/09/26 19:23:26 UTC, 0 replies.
- [jira] [Assigned] (NUTCH-1133) Fix TestInjector for Nutchgora - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2011/09/26 19:23:26 UTC, 0 replies.
- [jira] [Created] (NUTCH-1134) Fix TestFetcher for Nutchgora - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2011/09/26 19:23:26 UTC, 0 replies.
- Providing a list of FAQ's with every new subscribe request - posted by lewis john mcgibbney <le...@gmail.com> on 2011/09/26 19:53:00 UTC, 9 replies.
- [Nutch Wiki] Trivial Update of "Mailing" by LewisJohnMcgibbney - posted by Apache Wiki <wi...@apache.org> on 2011/09/26 22:28:26 UTC, 0 replies.
- unsub - posted by Christopher Bader <cb...@kratylos.com> on 2011/09/27 03:13:33 UTC, 0 replies.
- Build failed in Jenkins: Nutch-nutchgora #18 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2011/09/27 06:06:48 UTC, 0 replies.
- Jenkins build is back to normal : Nutch-trunk #1615 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2011/09/27 06:23:17 UTC, 0 replies.
- [jira] [Created] (NUTCH-1136) Ant pmd target is broken - posted by "Lewis John McGibbney (Created) (JIRA)" <ji...@apache.org> on 2011/09/27 21:03:45 UTC, 0 replies.
- Prepare for 1.4 release? - posted by Markus Jelsma <ma...@openindex.io> on 2011/09/27 23:01:11 UTC, 8 replies.
- Build failed in Jenkins: Nutch-trunk #1616 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2011/09/28 06:04:12 UTC, 1 replies.
- Build failed in Jenkins: Nutch-nutchgora #19 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2011/09/28 06:06:15 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1090) LinkDb (invertlinks) should inform the user when it ignores internal links - posted by "Julien Nioche (Updated) (JIRA)" <ji...@apache.org> on 2011/09/28 10:42:45 UTC, 0 replies.
- [jira] [Updated] (NUTCH-578) URL fetched with 403 is generated over and over again - posted by "Markus Jelsma (Updated) (JIRA)" <ji...@apache.org> on 2011/09/28 11:02:46 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1040) Backport REST-API from 2.0 - posted by "Julien Nioche (Updated) (JIRA)" <ji...@apache.org> on 2011/09/28 12:55:45 UTC, 1 replies.
- [jira] [Updated] (NUTCH-1031) Delegate parsing of robots.txt to crawler-commons - posted by "Julien Nioche (Updated) (JIRA)" <ji...@apache.org> on 2011/09/28 13:03:45 UTC, 0 replies.
- [jira] [Assigned] (NUTCH-1039) Fetcher fails for pages without content-length header - posted by "Julien Nioche (Assigned) (JIRA)" <ji...@apache.org> on 2011/09/28 13:03:45 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1129) Any23 Nutch plugin - posted by "Julien Nioche (Updated) (JIRA)" <ji...@apache.org> on 2011/09/28 13:05:46 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-937) When nutch is run on hadoop > 0.20.2 (or cdh) it will not find plugins because MapReduce will not unpack plugin/ directory from the job's pack (due to MAPREDUCE-967) - posted by "Julien Nioche (Resolved) (JIRA)" <ji...@apache.org> on 2011/09/28 13:19:45 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1079) StringBuffer converted to StringBuilder - posted by "Julien Nioche (Updated) (JIRA)" <ji...@apache.org> on 2011/09/28 13:27:45 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1047) Pluggable indexing backends - posted by "Julien Nioche (Updated) (JIRA)" <ji...@apache.org> on 2011/09/28 13:43:45 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1088) Write Solr XML documents - posted by "Julien Nioche (Updated) (JIRA)" <ji...@apache.org> on 2011/09/28 13:51:45 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1116) Write JUnit tests for all plugins - posted by "Julien Nioche (Updated) (JIRA)" <ji...@apache.org> on 2011/09/28 13:53:45 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1118) JUnit test for index-basic - posted by "Julien Nioche (Updated) (JIRA)" <ji...@apache.org> on 2011/09/28 13:59:45 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1119) JUnit test for index-static - posted by "Julien Nioche (Updated) (JIRA)" <ji...@apache.org> on 2011/09/28 13:59:45 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1117) JUnit test for index-anchor - posted by "Julien Nioche (Updated) (JIRA)" <ji...@apache.org> on 2011/09/28 13:59:45 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1127) JUnit test for urlfilter-validator - posted by "Julien Nioche (Updated) (JIRA)" <ji...@apache.org> on 2011/09/28 14:01:46 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1124) JUnit test for scoring-opic - posted by "Julien Nioche (Updated) (JIRA)" <ji...@apache.org> on 2011/09/28 14:01:46 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1128) JUnit test for urlmeta - posted by "Julien Nioche (Updated) (JIRA)" <ji...@apache.org> on 2011/09/28 14:01:46 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1123) JUnit test for scoring-link - posted by "Julien Nioche (Updated) (JIRA)" <ji...@apache.org> on 2011/09/28 14:01:47 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1130) JUnit test for Any23 RDF plugin - posted by "Julien Nioche (Updated) (JIRA)" <ji...@apache.org> on 2011/09/28 14:03:45 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1126) JUnit test for urlfilter-prefix - posted by "Julien Nioche (Updated) (JIRA)" <ji...@apache.org> on 2011/09/28 14:03:45 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1125) JUnit test for tld - posted by "Julien Nioche (Updated) (JIRA)" <ji...@apache.org> on 2011/09/28 15:09:45 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1120) JUnit test for microformats-reltag - posted by "Julien Nioche (Updated) (JIRA)" <ji...@apache.org> on 2011/09/28 15:09:45 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1121) JUnit test for parse-js - posted by "Julien Nioche (Updated) (JIRA)" <ji...@apache.org> on 2011/09/28 15:11:46 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1122) JUnit test for protocol-ftp - posted by "Julien Nioche (Updated) (JIRA)" <ji...@apache.org> on 2011/09/28 15:11:46 UTC, 0 replies.
- Jenkins build is back to normal : Nutch-trunk #1617 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2011/09/28 19:27:49 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1039) Fetcher fails for pages without content-length header - posted by "Markus Jelsma (Updated) (JIRA)" <ji...@apache.org> on 2011/09/28 21:42:46 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1088) Write Solr XML documents - posted by "Markus Jelsma (Commented) (JIRA)" <ji...@apache.org> on 2011/09/28 21:44:45 UTC, 0 replies.
- [jira] [Updated] (NUTCH-865) Format source code in unique style - posted by "Markus Jelsma (Updated) (JIRA)" <ji...@apache.org> on 2011/09/28 21:50:45 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1062) Migrate BasicURLNormalizer from Apache ORO to java.util.regex - posted by "Markus Jelsma (Updated) (JIRA)" <ji...@apache.org> on 2011/09/28 21:52:45 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1103) Port protocol-sftp to 1.4 - posted by "Markus Jelsma (Updated) (JIRA)" <ji...@apache.org> on 2011/09/28 21:58:45 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1024) Dynamically set fetchInterval by MIME-type - posted by "Markus Jelsma (Commented) (JIRA)" <ji...@apache.org> on 2011/09/28 22:00:48 UTC, 0 replies.
- [jira] [Updated] (NUTCH-961) Expose Tika's boilerpipe support - posted by "Markus Jelsma (Updated) (JIRA)" <ji...@apache.org> on 2011/09/28 22:00:48 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1107) Log slow parse entries - posted by "Markus Jelsma (Updated) (JIRA)" <ji...@apache.org> on 2011/09/28 22:02:45 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1100) SolrDedup broken - posted by "Markus Jelsma (Updated) (JIRA)" <ji...@apache.org> on 2011/09/28 22:02:45 UTC, 0 replies.
- [jira] [Issue Comment Edited] (NUTCH-1106) Options to skip url's based on length - posted by "Sebastian Nagel (Issue Comment Edited) (JIRA)" <ji...@apache.org> on 2011/09/28 22:28:45 UTC, 0 replies.
- [jira] [Assigned] (NUTCH-1098) better url-normalizer basic - posted by "Markus Jelsma (Assigned) (JIRA)" <ji...@apache.org> on 2011/09/28 22:46:45 UTC, 0 replies.
- [jira] [Created] (NUTCH-1137) LinkDb / invertlinks: command line arguments ignored - posted by "Sebastian Nagel (Created) (JIRA)" <ji...@apache.org> on 2011/09/28 23:50:45 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1137) LinkDb / invertlinks: command line arguments ignored - posted by "Sebastian Nagel (Updated) (JIRA)" <ji...@apache.org> on 2011/09/28 23:54:45 UTC, 1 replies.
- [jira] [Assigned] (NUTCH-1137) LinkDb / invertlinks: command line arguments ignored - posted by "Markus Jelsma (Assigned) (JIRA)" <ji...@apache.org> on 2011/09/29 00:24:45 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #1618 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2011/09/29 06:01:18 UTC, 0 replies.
- Build failed in Jenkins: Nutch-nutchgora #20 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2011/09/29 06:03:12 UTC, 0 replies.
- [jira] [Updated] (NUTCH-672) allow unit tests to be run from bin/nutch - posted by "Lewis John McGibbney (Updated) (JIRA)" <ji...@apache.org> on 2011/09/29 12:41:45 UTC, 2 replies.
- [jira] [Closed] (NUTCH-623) Change plugin source directory "languageidentifier" to "language-identifier" - posted by "Lewis John McGibbney (Closed) (JIRA)" <ji...@apache.org> on 2011/09/29 12:58:45 UTC, 0 replies.
- [jira] [Commented] (NUTCH-672) allow unit tests to be run from bin/nutch - posted by "Julien Nioche (Commented) (JIRA)" <ji...@apache.org> on 2011/09/29 13:12:45 UTC, 2 replies.
- [jira] [Updated] (NUTCH-1046) Add tests for indexing to SOLR - posted by "Julien Nioche (Updated) (JIRA)" <ji...@apache.org> on 2011/09/29 13:14:45 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1064) o.a.n.util.MimeUtil uses deprecated Tika methods - posted by "Julien Nioche (Updated) (JIRA)" <ji...@apache.org> on 2011/09/29 13:16:45 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-672) allow unit tests to be run from bin/nutch - posted by "Lewis John McGibbney (Resolved) (JIRA)" <ji...@apache.org> on 2011/09/29 13:42:45 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1041) Not reading mime-type correctly - posted by "Markus Jelsma (Updated) (JIRA)" <ji...@apache.org> on 2011/09/29 14:06:45 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1061) Migrate MoreIndexingFilter from Apache ORO to java.util.regex - posted by "Markus Jelsma (Updated) (JIRA)" <ji...@apache.org> on 2011/09/29 14:06:45 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1084) ReadDB url throws exception - posted by "Markus Jelsma (Updated) (JIRA)" <ji...@apache.org> on 2011/09/29 14:06:45 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1021) Migrate OutlinkExtractor from Apache ORO to java.util.regex - posted by "Markus Jelsma (Updated) (JIRA)" <ji...@apache.org> on 2011/09/29 14:06:45 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1060) URL filters to produce regexes to be used by OutlinkExtractor. - posted by "Markus Jelsma (Updated) (JIRA)" <ji...@apache.org> on 2011/09/29 14:10:45 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1063) OutlinkExtractor test generates an exception but does not fail - posted by "Markus Jelsma (Updated) (JIRA)" <ji...@apache.org> on 2011/09/29 14:10:45 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1087) Deprecate crawl command and replace with example script - posted by "Markus Jelsma (Updated) (JIRA)" <ji...@apache.org> on 2011/09/29 14:10:45 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1017) Exception getting mime type by name - posted by "Markus Jelsma (Updated) (JIRA)" <ji...@apache.org> on 2011/09/29 14:10:45 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1014) Migrate from Apache ORO to java.util.regex - posted by "Markus Jelsma (Updated) (JIRA)" <ji...@apache.org> on 2011/09/29 14:10:45 UTC, 0 replies.
- [jira] [Commented] (NUTCH-629) Detect slow and timeout servers and drop their URLs - posted by "Lewis John McGibbney (Commented) (JIRA)" <ji...@apache.org> on 2011/09/29 14:18:46 UTC, 1 replies.
- [jira] [Created] (NUTCH-1138) remove LogUtil from trunk and nutch gora - posted by "Lewis John McGibbney (Created) (JIRA)" <ji...@apache.org> on 2011/09/29 14:32:45 UTC, 0 replies.
- [jira] [Commented] (NUTCH-609) Allow Plugins to be Loaded from Jar File(s) - posted by "Lewis John McGibbney (Commented) (JIRA)" <ji...@apache.org> on 2011/09/29 19:51:45 UTC, 0 replies.
- [jira] [Commented] (NUTCH-896) Gora-based tests need to have their own config files - posted by "Lewis John McGibbney (Commented) (JIRA)" <ji...@apache.org> on 2011/09/30 00:03:47 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1135) Fix TestGoraStorage for Nutchgora - posted by "Lewis John McGibbney (Commented) (JIRA)" <ji...@apache.org> on 2011/09/30 00:13:45 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1136) Ant pmd target is broken - posted by "Lewis John McGibbney (Commented) (JIRA)" <ji...@apache.org> on 2011/09/30 00:19:49 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1136) Ant pmd target is broken - posted by "Lewis John McGibbney (Updated) (JIRA)" <ji...@apache.org> on 2011/09/30 00:19:49 UTC, 2 replies.
- [jira] [Commented] (NUTCH-965) Skip parsing for truncated documents - posted by "Lewis John McGibbney (Commented) (JIRA)" <ji...@apache.org> on 2011/09/30 00:23:45 UTC, 0 replies.
- [jira] [Assigned] (NUTCH-1136) Ant pmd target is broken - posted by "Lewis John McGibbney (Assigned) (JIRA)" <ji...@apache.org> on 2011/09/30 00:23:45 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1091) Remove commons logging dependency from Nutch branch and trunk - posted by "Lewis John McGibbney (Commented) (JIRA)" <ji...@apache.org> on 2011/09/30 00:25:45 UTC, 1 replies.
- [jira] [Commented] (NUTCH-1058) Upgrade Solr schema to version 1.4 - posted by "Lewis John McGibbney (Commented) (JIRA)" <ji...@apache.org> on 2011/09/30 00:27:46 UTC, 0 replies.
- Build failed in Jenkins: Nutch-nutchgora #21 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2011/09/30 06:04:51 UTC, 0 replies.
- Jenkins build is back to normal : Nutch-trunk #1619 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2011/09/30 06:23:47 UTC, 0 replies.
- [jira] [Updated] (NUTCH-809) Parse-metatags plugin - posted by "Elisabeth Adler (Updated) (JIRA)" <ji...@apache.org> on 2011/09/30 08:11:45 UTC, 1 replies.
- 1.4 release - newer hadoop jars - posted by Radim Kolar <hs...@sendmail.cz> on 2011/09/30 09:06:33 UTC, 1 replies.
- [jira] [Closed] (NUTCH-1091) Remove commons logging dependency from Nutch branch and trunk - posted by "Lewis John McGibbney (Closed) (JIRA)" <ji...@apache.org> on 2011/09/30 12:45:45 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1091) Remove commons logging dependency from Nutch branch and trunk - posted by "Lewis John McGibbney (Resolved) (JIRA)" <ji...@apache.org> on 2011/09/30 12:45:45 UTC, 0 replies.
- [jira] [Assigned] (NUTCH-865) Format source code in unique style - posted by "Lewis John McGibbney (Assigned) (JIRA)" <ji...@apache.org> on 2011/09/30 12:55:45 UTC, 0 replies.
- [jira] [Commented] (NUTCH-809) Parse-metatags plugin - posted by "Julien Nioche (Commented) (JIRA)" <ji...@apache.org> on 2011/09/30 13:52:46 UTC, 0 replies.
- Choosing an efficient family configuration for GORA HBase - posted by Ferdy Galema <fe...@kalooga.com> on 2011/09/30 14:57:12 UTC, 0 replies.
- [jira] [Created] (NUTCH-1139) Indexer to delete documents - posted by "Markus Jelsma (Created) (JIRA)" <ji...@apache.org> on 2011/09/30 16:08:46 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1052) Multiple deletes of the same URL using SolrClean - posted by "Markus Jelsma (Resolved) (JIRA)" <ji...@apache.org> on 2011/09/30 16:10:45 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1139) Indexer to delete documents - posted by "Markus Jelsma (Updated) (JIRA)" <ji...@apache.org> on 2011/09/30 17:05:45 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1140) index-more plugin, resetTitle method creates multiple values in the Title field - posted by "Joe Liedtke (Updated) (JIRA)" <ji...@apache.org> on 2011/09/30 21:05:45 UTC, 2 replies.
- [jira] [Created] (NUTCH-1140) index-more plugin, resetTitle method creates multiple values in the Title field - posted by "Joe Liedtke (Created) (JIRA)" <ji...@apache.org> on 2011/09/30 21:05:45 UTC, 0 replies.
- [jira] [Assigned] (NUTCH-1135) Fix TestGoraStorage for Nutchgora - posted by "Lewis John McGibbney (Assigned) (JIRA)" <ji...@apache.org> on 2011/09/30 21:47:46 UTC, 0 replies.