- Build failed in Jenkins: Nutch-nutchgora #365 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2012/10/01 06:06:14 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #1974 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2012/10/01 06:07:25 UTC, 0 replies.
- [jira] [Commented] (NUTCH-827) HTTP POST Authentication - posted by "Max Dzyuba (JIRA)" <ji...@apache.org> on 2012/10/01 10:13:12 UTC, 9 replies.
- [jira] [Comment Edited] (NUTCH-827) HTTP POST Authentication - posted by "Jasper van Veghel (JIRA)" <ji...@apache.org> on 2012/10/01 13:41:08 UTC, 1 replies.
- [PING] [VOTE] Apache Nutch 2.1 Release Candidate Available - posted by Lewis John Mcgibbney <le...@gmail.com> on 2012/10/01 14:17:31 UTC, 2 replies.
- Re: [VOTE] Apache Nutch 2.1 Release Candidate Available - posted by Julien Nioche <li...@gmail.com> on 2012/10/01 14:36:06 UTC, 7 replies.
- [jira] [Updated] (NUTCH-1467) nutch 1.5.1 not able to parse mutliValued metatags - posted by "kiran (JIRA)" <ji...@apache.org> on 2012/10/01 20:39:07 UTC, 2 replies.
- [jira] [Comment Edited] (NUTCH-1467) nutch 1.5.1 not able to parse mutliValued metatags - posted by "kiran (JIRA)" <ji...@apache.org> on 2012/10/01 20:41:07 UTC, 1 replies.
- Jenkins build is back to normal : Nutch-nutchgora #366 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2012/10/02 06:21:32 UTC, 0 replies.
- Jenkins build is back to normal : Nutch-trunk #1975 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2012/10/02 06:34:18 UTC, 0 replies.
- [jira] [Commented] (NUTCH-585) [PARSE-HTML plugin] Block certain parts of HTML code from being indexed - posted by "Iwan Luijks (JIRA)" <ji...@apache.org> on 2012/10/02 11:47:08 UTC, 1 replies.
- [jira] [Commented] (NUTCH-1467) nutch 1.5.1 not able to parse mutliValued metatags - posted by "kiran (JIRA)" <ji...@apache.org> on 2012/10/02 12:01:08 UTC, 3 replies.
- [jira] [Commented] (NUTCH-706) Url regex normalizer - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2012/10/02 21:19:08 UTC, 1 replies.
- Build failed in Jenkins: Nutch-nutchgora #368 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2012/10/04 06:06:17 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #1977 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2012/10/04 06:07:48 UTC, 0 replies.
- Re: patches to parse-metatag plugin to save mutliValues - posted by Lewis John Mcgibbney <le...@gmail.com> on 2012/10/04 18:09:30 UTC, 6 replies.
- [RESULT] Was Re: [PING] [VOTE] Apache Nutch 2.1 Release Candidate Available - posted by Lewis John Mcgibbney <le...@gmail.com> on 2012/10/04 18:19:12 UTC, 0 replies.
- Updating Nutch Crawled index. - posted by "atuldj.jadhav" <at...@gmail.com> on 2012/10/04 20:26:20 UTC, 0 replies.
- Jenkins build is back to normal : Nutch-nutchgora #369 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2012/10/05 06:37:44 UTC, 0 replies.
- Jenkins build is back to normal : Nutch-trunk #1978 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2012/10/05 06:51:18 UTC, 0 replies.
- [ANNOUNCE] Apache Nutch 2.1 Released - posted by lewis john mcgibbney <le...@apache.org> on 2012/10/05 17:12:10 UTC, 3 replies.
- [jira] [Created] (NUTCH-1475) Nutch 2.1 Index-More Plugin -- A better fall back value for date field - posted by "James Sullivan (JIRA)" <ji...@apache.org> on 2012/10/07 03:51:02 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1475) Nutch 2.1 Index-More Plugin -- A better fall back value for date field - posted by "James Sullivan (JIRA)" <ji...@apache.org> on 2012/10/07 03:51:02 UTC, 3 replies.
- Build failed in Jenkins: Nutch-nutchgora #371 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2012/10/07 06:07:40 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #1980 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2012/10/07 06:09:28 UTC, 0 replies.
- Build failed in Jenkins: Nutch-nutchgora #372 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2012/10/08 06:07:07 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #1981 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2012/10/08 06:08:26 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1457) Nutch2 Refactor the update process so that fetched items are only processed once - posted by "Ferdy Galema (JIRA)" <ji...@apache.org> on 2012/10/08 12:10:03 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1476) SegmentReader getStats should set parsed = -1 if no parsing took place - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2012/10/09 00:08:02 UTC, 0 replies.
- [jira] [Created] (NUTCH-1476) SegmentReader getStats should set parsed = -1 if no parsing took place - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2012/10/09 00:08:02 UTC, 0 replies.
- [jira] [Assigned] (NUTCH-1252) SegmentReader -get shows wrong data - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2012/10/09 00:10:03 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1344) BasicURLNormalizer to normalize https same as http - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2012/10/09 00:20:02 UTC, 5 replies.
- Jenkins build is back to normal : Nutch-nutchgora #373 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2012/10/09 06:22:54 UTC, 0 replies.
- Jenkins build is back to normal : Nutch-trunk #1982 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2012/10/09 06:32:57 UTC, 0 replies.
- Nutch 2.x architecture Supporting multivalues - posted by kiran chitturi <ch...@gmail.com> on 2012/10/10 21:41:52 UTC, 0 replies.
- [jira] [Updated] (NUTCH-706) Url regex normalizer: default pattern for session id removal not to match "newsId" - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2012/10/10 22:43:04 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-706) Url regex normalizer: default pattern for session id removal not to match "newsId" - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2012/10/10 23:15:02 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1344) BasicURLNormalizer to normalize https same as http - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2012/10/10 23:21:03 UTC, 0 replies.
- [jira] [Commented] (NUTCH-706) Url regex normalizer: default pattern for session id removal not to match "newsId" - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2012/10/10 23:59:03 UTC, 2 replies.
- [jira] [Updated] (NUTCH-874) Make sure all plugins in src/plugin are compatible with Nutch 2.0 and Gora - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2012/10/11 00:57:03 UTC, 0 replies.
- [jira] [Commented] (NUTCH-874) Make sure all plugins in src/plugin are compatible with Nutch 2.0 and Gora - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2012/10/11 01:05:04 UTC, 1 replies.
- [jira] [Created] (NUTCH-1477) NPE when injecting with DataFileAvroStore - posted by "Mike Baranczak (JIRA)" <ji...@apache.org> on 2012/10/11 03:13:03 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1477) NPE when injecting with DataFileAvroStore - posted by "Mike Baranczak (JIRA)" <ji...@apache.org> on 2012/10/11 03:23:03 UTC, 6 replies.
- Jenkins build is back to normal : Nutch-nutchgora #375 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2012/10/11 06:18:46 UTC, 0 replies.
- Jenkins build is back to normal : Nutch-trunk #1984 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2012/10/11 06:25:10 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1475) Nutch 2.1 Index-More Plugin -- A better fall back value for date field - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2012/10/11 15:49:03 UTC, 3 replies.
- [jira] [Resolved] (NUTCH-1252) SegmentReader -get shows wrong data - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2012/10/11 22:23:02 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1476) SegmentReader getStats should set parsed = -1 if no parsing took place - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2012/10/11 22:45:03 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1383) IndexingFiltersChecker to show error message instead of null pointer exception - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2012/10/11 23:07:03 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1476) SegmentReader getStats should set parsed = -1 if no parsing took place - posted by "Hudson (JIRA)" <ji...@apache.org> on 2012/10/11 23:33:03 UTC, 1 replies.
- [jira] [Commented] (NUTCH-1252) SegmentReader -get shows wrong data - posted by "Hudson (JIRA)" <ji...@apache.org> on 2012/10/11 23:33:03 UTC, 1 replies.
- [jira] [Commented] (NUTCH-1383) IndexingFiltersChecker to show error message instead of null pointer exception - posted by "Hudson (JIRA)" <ji...@apache.org> on 2012/10/11 23:33:03 UTC, 1 replies.
- [jira] [Commented] (NUTCH-1433) Upgrade to Tika 1.2 - posted by "kiran (JIRA)" <ji...@apache.org> on 2012/10/16 01:17:03 UTC, 2 replies.
- [jira] [Commented] (NUTCH-710) Support for rel="canonical" attribute - posted by "Iwan Luijks (JIRA)" <ji...@apache.org> on 2012/10/17 09:44:03 UTC, 2 replies.
- [jira] [Comment Edited] (NUTCH-710) Support for rel="canonical" attribute - posted by "Iwan Luijks (JIRA)" <ji...@apache.org> on 2012/10/17 13:54:04 UTC, 0 replies.
- Build failed in Jenkins: Nutch-nutchgora #382 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2012/10/18 11:55:33 UTC, 0 replies.
- [jira] [Created] (NUTCH-1478) Parse-metatags and index-metadata plugin for Nutch 2.x series - posted by "kiran (JIRA)" <ji...@apache.org> on 2012/10/18 23:10:04 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1478) Parse-metatags and index-metadata plugin for Nutch 2.x series - posted by "kiran (JIRA)" <ji...@apache.org> on 2012/10/18 23:12:04 UTC, 5 replies.
- Jenkins build is back to normal : Nutch-nutchgora #383 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2012/10/19 10:52:27 UTC, 0 replies.
- [jira] [Created] (NUTCH-1479) nutch readhostdb and updatehostdb do not work with MySQL - posted by "James Sullivan (JIRA)" <ji...@apache.org> on 2012/10/20 00:18:12 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1087) Deprecate crawl command and replace with example script - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2012/10/20 10:52:12 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1433) Upgrade to Tika 1.2 - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2012/10/20 11:16:12 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1087) Deprecate crawl command and replace with example script - posted by "Hudson (JIRA)" <ji...@apache.org> on 2012/10/21 06:26:14 UTC, 0 replies.
- [jira] [Closed] (NUTCH-1479) nutch readhostdb and updatehostdb do not work with MySQL - posted by "James Sullivan (JIRA)" <ji...@apache.org> on 2012/10/22 02:46:12 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1377) Add option to index via CloudSolrServer instead - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2012/10/22 16:46:14 UTC, 0 replies.
- [jira] [Created] (NUTCH-1480) SolrIndexer to write to multiple servers. - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2012/10/22 16:50:12 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1422) reset signature for redirects - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2012/10/23 11:35:12 UTC, 1 replies.
- [jira] [Resolved] (NUTCH-1215) UpdateDB should not require segment as input - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2012/10/23 11:47:11 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1341) NotModified time set to now but page not modified - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2012/10/23 11:51:12 UTC, 3 replies.
- [jira] [Commented] (NUTCH-1215) UpdateDB should not require segment as input - posted by "Hudson (JIRA)" <ji...@apache.org> on 2012/10/23 12:43:11 UTC, 1 replies.
- [jira] [Resolved] (NUTCH-1341) NotModified time set to now but page not modified - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2012/10/23 15:29:12 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1421) RegexURLNormalizer to only skip rules with invalid patterns - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2012/10/23 22:55:13 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1421) RegexURLNormalizer to only skip rules with invalid patterns - posted by "Hudson (JIRA)" <ji...@apache.org> on 2012/10/23 23:20:12 UTC, 2 replies.
- [jira] [Assigned] (NUTCH-1480) SolrIndexer to write to multiple servers. - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2012/10/24 12:54:13 UTC, 0 replies.
- [jira] [Work stopped] (NUTCH-1480) SolrIndexer to write to multiple servers. - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2012/10/24 12:54:14 UTC, 0 replies.
- [jira] [Work started] (NUTCH-1480) SolrIndexer to write to multiple servers. - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2012/10/24 12:54:14 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1480) SolrIndexer to write to multiple servers. - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2012/10/24 13:00:13 UTC, 0 replies.
- Build failed in Jenkins: Nutch-nutchgora #387 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2012/10/25 01:26:20 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1477) NPE when injecting with DataFileAvroStore - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2012/10/25 15:49:13 UTC, 2 replies.
- misbehaving crawler - posted by Alex diNorcia <al...@dinorcia.net> on 2012/10/25 17:59:07 UTC, 2 replies.
- [jira] [Created] (NUTCH-1481) When using MySQL as storage unicode characters within URLS cause nutch to fail - posted by "Arni Sumarlidason (JIRA)" <ji...@apache.org> on 2012/10/26 03:31:12 UTC, 0 replies.
- Jenkins build is back to normal : Nutch-nutchgora #388 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2012/10/26 06:23:43 UTC, 0 replies.
- Unsubscription - posted by Nishikawa, Alfonso <an...@indra.es> on 2012/10/26 08:18:17 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1481) When using MySQL as storage unicode characters within URLS cause nutch to fail - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2012/10/26 15:31:12 UTC, 2 replies.
- [jira] [Updated] (NUTCH-585) [PARSE-HTML plugin] Block certain parts of HTML code from being indexed - posted by "Roberto Gardenier (JIRA)" <ji...@apache.org> on 2012/10/29 15:20:12 UTC, 0 replies.
- [jira] [Created] (NUTCH-1482) Rename HTMLParseFilter - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2012/10/29 17:00:16 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1482) Rename HTMLParseFilter - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2012/10/29 17:04:13 UTC, 4 replies.
- [jira] [Updated] (NUTCH-1245) URL gone with 404 after db.fetch.interval.max stays db_unfetched in CrawlDb and is generated over and over again - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2012/10/29 17:16:12 UTC, 2 replies.
- [jira] [Assigned] (NUTCH-1370) Expose exact number of urls injected @runtime - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2012/10/29 17:20:14 UTC, 0 replies.
- NUTCH-1370 - posted by Lewis John Mcgibbney <le...@gmail.com> on 2012/10/29 17:22:42 UTC, 5 replies.
- [jira] [Updated] (NUTCH-578) URL fetched with 403 is generated over and over again - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2012/10/30 00:20:15 UTC, 0 replies.
- [jira] [Commented] (NUTCH-578) URL fetched with 403 is generated over and over again - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2012/10/30 00:20:15 UTC, 1 replies.
- Build failed in Jenkins: Nutch-nutchgora #392 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2012/10/30 11:49:27 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #2001 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2012/10/30 15:26:24 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1370) Expose exact number of urls injected @runtime - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2012/10/30 23:32:13 UTC, 1 replies.
- [jira] [Created] (NUTCH-1483) Can't crawl filesystem with protocol-file plugin - posted by "Rogério Pereira Araújo (JIRA)" <ji...@apache.org> on 2012/10/31 17:15:12 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1483) Can't crawl filesystem with protocol-file plugin - posted by "Rogério Pereira Araújo (JIRA)" <ji...@apache.org> on 2012/10/31 17:17:11 UTC, 4 replies.
- [jira] [Commented] (NUTCH-1483) Can't crawl filesystem with protocol-file plugin - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2012/10/31 20:36:12 UTC, 4 replies.