- [jira] [Comment Edited] (NUTCH-1645) Junit Test Case for Adaptive Fetch Schedule class - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2014/02/01 17:50:11 UTC, 0 replies.
- [jira] [Created] (NUTCH-1722) FetcherJob#fetch throws NullPointerException for null batchId - posted by "Gerhard Gossen (JIRA)" <ji...@apache.org> on 2014/02/03 17:24:10 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1722) FetcherJob#fetch throws NullPointerException for null batchId - posted by "Gerhard Gossen (JIRA)" <ji...@apache.org> on 2014/02/03 17:24:11 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1721) Upgrade to Crawler commons 0.3 - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2014/02/03 18:58:09 UTC, 2 replies.
- [jira] [Commented] (NUTCH-1645) Junit Test Case for Adaptive Fetch Schedule class - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2014/02/03 18:58:11 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1722) FetcherJob#fetch throws NullPointerException for null batchId - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2014/02/03 19:02:10 UTC, 3 replies.
- [jira] [Commented] (NUTCH-1478) Parse-metatags and index-metadata plugin for Nutch 2.x series - posted by "Anton (JIRA)" <ji...@apache.org> on 2014/02/04 10:22:11 UTC, 8 replies.
- [jira] [Comment Edited] (NUTCH-1478) Parse-metatags and index-metadata plugin for Nutch 2.x series - posted by "Anton (JIRA)" <ji...@apache.org> on 2014/02/04 10:24:11 UTC, 7 replies.
- [jira] [Updated] (NUTCH-1371) Replace Ivy with Maven Ant tasks - posted by "Alparslan Avcı (JIRA)" <ji...@apache.org> on 2014/02/04 13:10:10 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1371) Replace Ivy with Maven Ant tasks - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2014/02/04 13:40:11 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1711) Normalizer does not encode exclamation mark - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2014/02/04 20:00:14 UTC, 0 replies.
- [jira] [Created] (NUTCH-1723) nutch updatedb fails due to avro (de)serialization issues on images - posted by "Koen Smets (JIRA)" <ji...@apache.org> on 2014/02/04 21:08:11 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1723) nutch updatedb fails due to avro (de)serialization issues on images - posted by "Koen Smets (JIRA)" <ji...@apache.org> on 2014/02/04 21:22:12 UTC, 2 replies.
- [jira] [Commented] (NUTCH-1556) enabling updatedb to accept batchId - posted by "Koen Smets (JIRA)" <ji...@apache.org> on 2014/02/05 14:08:10 UTC, 2 replies.
- [jira] [Comment Edited] (NUTCH-1556) enabling updatedb to accept batchId - posted by "Koen Smets (JIRA)" <ji...@apache.org> on 2014/02/05 14:10:12 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1679) UpdateDb using batchId, link may override crawled page. - posted by "Koen Smets (JIRA)" <ji...@apache.org> on 2014/02/05 19:50:16 UTC, 2 replies.
- [jira] [Comment Edited] (NUTCH-1679) UpdateDb using batchId, link may override crawled page. - posted by "Koen Smets (JIRA)" <ji...@apache.org> on 2014/02/05 19:56:12 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1286) Refactoring/reimplementing crawling API (NutchApp) - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2014/02/06 01:50:09 UTC, 0 replies.
- [jira] [Updated] (NUTCH-841) Create a Wicket-based Web Application for Nutch - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2014/02/06 01:50:14 UTC, 0 replies.
- Re: [REQUEST] Integrate Wicket and Nutch for Google Summer of Code 2014 - posted by Martin Grigorov <mg...@apache.org> on 2014/02/06 09:19:33 UTC, 3 replies.
- [jira] [Comment Edited] (NUTCH-1723) nutch updatedb fails due to avro (de)serialization issues on images - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2014/02/06 20:52:11 UTC, 1 reply.
- [jira] [Commented] (NUTCH-710) Support for rel="canonical" attribute - posted by "Joshua Norris (JIRA)" <ji...@apache.org> on 2014/02/07 01:52:20 UTC, 3 replies.
- [jira] [Resolved] (NUTCH-1721) Upgrade to Crawler commons 0.3 - posted by "Tejas Patil (JIRA)" <ji...@apache.org> on 2014/02/09 10:11:19 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #2524 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2014/02/10 05:03:13 UTC, 0 replies.
- [jira] [Created] (NUTCH-1724) LinkDBReader to support regex output filtering - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2014/02/10 11:59:19 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1724) LinkDBReader to support regex output filtering - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2014/02/10 11:59:19 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1707) DummyIndexingWriter - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2014/02/10 12:01:19 UTC, 2 replies.
- [jira] [Resolved] (NUTCH-1707) DummyIndexingWriter - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2014/02/10 12:47:20 UTC, 0 replies.
- Jenkins build is back to normal : Nutch-trunk #2525 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2014/02/10 13:52:50 UTC, 0 replies.
- [jira] [Created] (NUTCH-1725) CleaningJob's reducer does not commit deleted docs. - posted by "İlhami KALKAN (JIRA)" <ji...@apache.org> on 2014/02/11 14:56:20 UTC, 0 replies.
- [jira] [Closed] (NUTCH-1725) CleaningJob's reducer does not commit deleted docs. - posted by "İlhami KALKAN (JIRA)" <ji...@apache.org> on 2014/02/11 15:09:20 UTC, 0 replies.
- [jira] [Comment Edited] (NUTCH-1662) Indexer Plugin for Solr Cloud - posted by "Yasin Kılınç (JIRA)" <ji...@apache.org> on 2014/02/12 10:25:19 UTC, 1 reply.
- [jira] [Commented] (NUTCH-1718) update description of property http.robots.agent - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2014/02/12 12:47:20 UTC, 0 replies.
- [jira] [Created] (NUTCH-1726) HeadingsFilter does not find nested nodes - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2014/02/12 13:09:19 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1726) HeadingsFilter does not find nested nodes - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2014/02/12 13:11:19 UTC, 2 replies.
- [jira] [Commented] (NUTCH-961) Expose Tika's boilerpipe support - posted by "Matzz (JIRA)" <ji...@apache.org> on 2014/02/12 13:13:22 UTC, 1 reply.
- Is it possible to run Nutch 2.x with httpclient 3 and 4 simultaneously? - posted by d_k <ma...@gmail.com> on 2014/02/12 15:17:27 UTC, 0 replies.
- Re: [DISCUSS] Release Trunk - posted by Julien Nioche <li...@gmail.com> on 2014/02/12 16:33:12 UTC, 5 replies.
- [jira] [Created] (NUTCH-1727) Length of the Tlds - posted by "Sertac TURKEL (JIRA)" <ji...@apache.org> on 2014/02/12 17:40:19 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1727) Length of the Tlds - posted by "Sertac TURKEL (JIRA)" <ji...@apache.org> on 2014/02/12 17:50:19 UTC, 5 replies.
- [jira] [Commented] (NUTCH-1727) Length of the Tlds - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2014/02/13 09:30:23 UTC, 1 reply.
- [jira] [Reopened] (NUTCH-1725) CleaningJob's reducer does not commit deleted docs. - posted by "İlhami KALKAN (JIRA)" <ji...@apache.org> on 2014/02/13 11:03:22 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1725) CleaningJob's reducer does not commit deleted docs. - posted by "İlhami KALKAN (JIRA)" <ji...@apache.org> on 2014/02/13 14:18:19 UTC, 0 replies.
- [jira] [Created] (NUTCH-1728) indexer-solr plugin is not delete docs from solr - posted by "İlhami KALKAN (JIRA)" <ji...@apache.org> on 2014/02/13 14:30:38 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1728) indexer-solr plugin is not delete docs from solr - posted by "İlhami KALKAN (JIRA)" <ji...@apache.org> on 2014/02/13 14:32:19 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1726) HeadingsFilter does not find nested nodes - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2014/02/13 15:04:19 UTC, 3 replies.
- [jira] [Updated] (NUTCH-1525) Generator to record external links even when db.ignore.external.links set to true - posted by "Dmitry Cherniachenko (JIRA)" <ji...@apache.org> on 2014/02/14 10:26:19 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1525) Generator to record external links even when db.ignore.external.links set to true - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2014/02/14 12:02:19 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1706) IndexerMapReduce does not remove db_redir_temp etc - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2014/02/18 11:03:20 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1706) IndexerMapReduce does not remove db_redir_temp etc - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2014/02/18 11:03:20 UTC, 6 replies.
- Getting statistics about crawled pages - posted by Alparslan Avcı <al...@agmlab.com> on 2014/02/19 14:07:26 UTC, 2 replies.
- [jira] [Created] (NUTCH-1729) Upgrade to Tika 1.5 - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2014/02/20 10:10:20 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1729) Upgrade to Tika 1.5 - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2014/02/20 10:24:19 UTC, 1 reply.
- [jira] [Commented] (NUTCH-1113) Merging segments causes URLs to vanish from crawldb/index? - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2014/02/20 17:45:19 UTC, 5 replies.
- Build failed in Jenkins: Nutch-trunk #2536 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2014/02/21 05:02:41 UTC, 0 replies.
- Common Crawl's Move to Apache Nutch - posted by Julien Nioche <li...@gmail.com> on 2014/02/21 09:51:09 UTC, 1 reply.
- Nutch roadmap and documentation - posted by Mateusz Zakarczemny <ma...@up2data.pl> on 2014/02/21 11:29:46 UTC, 3 replies.
- [jira] [Commented] (NUTCH-1729) Upgrade to Tika 1.5 - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2014/02/21 13:02:19 UTC, 7 replies.
- [jira] [Resolved] (NUTCH-1729) Upgrade to Tika 1.5 - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2014/02/21 13:08:20 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1113) Merging segments causes URLs to vanish from crawldb/index? - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2014/02/21 13:12:19 UTC, 2 replies.
- Jenkins build is back to normal : Nutch-trunk #2537 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2014/02/21 13:51:55 UTC, 0 replies.
- [jira] [Created] (NUTCH-1730) Scoring-depth optionally not to increment depth for external hosts - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2014/02/21 17:58:19 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1730) Scoring-depth optionally not to increment depth for external hosts - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2014/02/21 17:58:20 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1671) indexchecker to add digest field - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2014/02/21 21:27:19 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1605) mime type detector recognizes xlsx as zip file - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2014/02/21 22:35:28 UTC, 1 reply.
- [jira] [Comment Edited] (NUTCH-1726) HeadingsFilter does not find nested nodes - posted by "lufeng (JIRA)" <ji...@apache.org> on 2014/02/24 15:43:19 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1679) UpdateDb using batchId, link may override crawled page. - posted by "Koen Smets (JIRA)" <ji...@apache.org> on 2014/02/25 12:58:20 UTC, 0 replies.
- [jira] [Commented] (NUTCH-841) Create a Wicket-based Web Application for Nutch - posted by "Fjodor Vershinin (JIRA)" <ji...@apache.org> on 2014/02/26 18:45:27 UTC, 1 reply.
- [jira] [Updated] (NUTCH-1727) Configurable length for Tlds - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2014/02/26 18:45:27 UTC, 2 replies.
- [jira] [Created] (NUTCH-1731) Add stop flag to NutchServer - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2014/02/26 18:58:21 UTC, 0 replies.
- [jira] [Assigned] (NUTCH-1731) Better cmd line parsing for NutchServer - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2014/02/27 00:33:21 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1731) Better cmd line parsing for NutchServer - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2014/02/27 00:33:21 UTC, 0 replies.
- [jira] [Work started] (NUTCH-1731) Better cmd line parsing for NutchServer - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2014/02/27 00:35:19 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1253) Incompatible neko and xerces versions - posted by "Yasin Kılınç (JIRA)" <ji...@apache.org> on 2014/02/28 09:33:23 UTC, 2 replies.
- [jira] [Updated] (NUTCH-1478) Parse-metatags and index-metadata plugin for Nutch 2.x series - posted by "Talat UYARER (JIRA)" <ji...@apache.org> on 2014/02/28 11:01:25 UTC, 0 replies.
- [jira] [Created] (NUTCH-1732) IndexerMapReduce to delete explicitly not indexable documents - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2014/02/28 15:41:22 UTC, 0 replies.
- [jira] [Comment Edited] (NUTCH-1113) Merging segments causes URLs to vanish from crawldb/index? - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2014/02/28 15:45:21 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1732) IndexerMapReduce to delete explicitly not indexable documents - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2014/02/28 15:49:21 UTC, 1 reply.
- HTTP Post request - posted by Zabini <an...@actimage.com> on 2014/02/28 16:06:49 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1113) Merging segments causes URLs to vanish from crawldb/index? - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2014/02/28 16:46:21 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #2545 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2014/02/28 17:44:45 UTC, 0 replies.