- [jira] [Commented] (NUTCH-2191) Add protocol-htmlunit - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2016/04/01 19:03:25 UTC, 7 replies.
- Reg. License of Princeton WordNet - posted by Bhavya Sanghavi <bh...@gmail.com> on 2016/04/02 01:07:50 UTC, 1 replies.
- [GitHub] nutch pull request: Fix for NUTCH-2245 NGram Model for Cosine Simi... - posted by lewismc <gi...@git.apache.org> on 2016/04/02 01:13:53 UTC, 5 replies.
- [jira] [Commented] (NUTCH-2245) Developed the NGram Model on the existing Unigram Cosine Similarity Model - posted by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2016/04/02 01:14:25 UTC, 6 replies.
- [jira] [Created] (NUTCH-2246) Refactor /seed endpoint for backward compatibility - posted by "Sujen Shah (JIRA)" <ji...@apache.org> on 2016/04/03 04:33:25 UTC, 0 replies.
- [jira] [Work started] (NUTCH-2245) Developed the NGram Model on the existing Unigram Cosine Similarity Model - posted by "Sujen Shah (JIRA)" <ji...@apache.org> on 2016/04/03 04:35:25 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2245) Developed the NGram Model on the existing Unigram Cosine Similarity Model - posted by "Sujen Shah (JIRA)" <ji...@apache.org> on 2016/04/03 04:35:25 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #3358 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2016/04/04 09:57:11 UTC, 0 replies.
- Recent stackoverflow questions - posted by Sujen Shah <su...@apache.org> on 2016/04/05 22:30:46 UTC, 2 replies.
- FW: Apache Tika used to parse the Panama papers! - posted by "Mattmann, Chris A (3980)" <ch...@jpl.nasa.gov> on 2016/04/06 01:13:46 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2247) Protocol resolver - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2016/04/06 18:25:25 UTC, 0 replies.
- [jira] [Created] (NUTCH-2247) Protocol resolver - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2016/04/06 18:25:25 UTC, 0 replies.
- [GitHub] nutch pull request: NUTCH-2222 re-fetch deletes all metadata excep... - posted by asfgit <gi...@git.apache.org> on 2016/04/07 17:16:24 UTC, 0 replies.
- [jira] [Commented] (NUTCH-2222) re-fetch deletes all metadata except _csh_ and _rs_ - posted by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2016/04/07 17:16:25 UTC, 2 replies.
- Build failed in Jenkins: Nutch-nutchgora #1551 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2016/04/07 17:44:20 UTC, 0 replies.
- [jira] [Created] (NUTCH-2248) CSS parser plugin - posted by "Joseph Naegele (JIRA)" <ji...@apache.org> on 2016/04/07 23:43:25 UTC, 0 replies.
- [GitHub] nutch pull request: NUTCH-2248 CSS Parser plugin parse-css - posted by naegelejd <gi...@git.apache.org> on 2016/04/07 23:47:13 UTC, 0 replies.
- [jira] [Commented] (NUTCH-2248) CSS parser plugin - posted by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2016/04/07 23:47:25 UTC, 2 replies.
- Re: Nutch: Tika Parser error while parsing an image - posted by Karanjeet Singh <ka...@usc.edu> on 2016/04/08 12:18:37 UTC, 1 replies.
- [jira] [Updated] (NUTCH-2191) Add protocol-htmlunit - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2016/04/08 18:13:25 UTC, 0 replies.
- Maven Central Plugins - posted by Lewis John Mcgibbney <le...@gmail.com> on 2016/04/10 15:54:47 UTC, 2 replies.
- [GitHub] nutch pull request: NUTCH-2248 CSS parser plugin - posted by lewismc <gi...@git.apache.org> on 2016/04/11 19:10:26 UTC, 1 replies.
- [GitHub] nutch pull request: fix for NUTCH-2238 contributed by ptorrestr - posted by lewismc <gi...@git.apache.org> on 2016/04/11 21:25:17 UTC, 3 replies.
- [jira] [Commented] (NUTCH-2238) Indexer for Elasticsearch 2.x - posted by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2016/04/11 21:25:25 UTC, 4 replies.
- [jira] [Created] (NUTCH-2249) WordNet Integration for Cosine Similarity - posted by "Bhavya Sanghavi (JIRA)" <ji...@apache.org> on 2016/04/13 00:16:25 UTC, 0 replies.
- [jira] [Commented] (NUTCH-2249) WordNet Integration for Cosine Similarity - posted by "Bhavya Sanghavi (JIRA)" <ji...@apache.org> on 2016/04/13 00:18:25 UTC, 0 replies.
- [Nutch Wiki] Update of "AdvancedAjaxInteraction" by ChrisMattmann - posted by Apache Wiki <wi...@apache.org> on 2016/04/13 07:42:03 UTC, 0 replies.
- Adding a new field to Nutch + MongoDB datastore using plugin - posted by Jean Vence <jv...@gmail.com> on 2016/04/13 13:49:21 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2249) WordNet Integration for Cosine Similarity - posted by "Sujen Shah (JIRA)" <ji...@apache.org> on 2016/04/13 19:51:25 UTC, 1 replies.
- [jira] [Assigned] (NUTCH-2249) WordNet Integration for Cosine Similarity - posted by "Sujen Shah (JIRA)" <ji...@apache.org> on 2016/04/13 19:51:25 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2238) Indexer for Elasticsearch 2.x - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2016/04/13 20:30:25 UTC, 1 replies.
- [jira] [Updated] (NUTCH-2222) re-fetch deletes all metadata except _csh_ and _rs_ - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2016/04/13 20:31:25 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1741) Support of Sitemaps in Nutch 2.x - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2016/04/13 20:32:25 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-2238) Indexer for Elasticsearch 2.x - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2016/04/13 20:33:25 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2217) Crawl pages with specified language - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2016/04/13 20:38:25 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2188) While crawling with solr url (kerberos enabled) Error: org.apache.solr.common.SolrException: Unauthorized - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2016/04/13 20:39:25 UTC, 0 replies.
- Build failed in Jenkins: Nutch-nutchgora #1552 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2016/04/13 20:45:59 UTC, 0 replies.
- [jira] [Created] (NUTCH-2250) CommonCrawlDumper : Invalid format + skipped parts - posted by "Thamme Gowda N (JIRA)" <ji...@apache.org> on 2016/04/14 11:21:25 UTC, 0 replies.
- [GitHub] nutch pull request: NUTCH-2250 : CommonCrawlDumper : Invalid forma... - posted by thammegowda <gi...@git.apache.org> on 2016/04/14 11:44:51 UTC, 3 replies.
- [jira] [Commented] (NUTCH-2250) CommonCrawlDumper : Invalid format + skipped parts - posted by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2016/04/14 11:45:25 UTC, 4 replies.
- [jira] [Created] (NUTCH-2251) Make CommonCrawlFormatJackson instance reusable for by properly handling object state when it used to format many documents - posted by "Thamme Gowda N (JIRA)" <ji...@apache.org> on 2016/04/15 07:11:25 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2251) Make CommonCrawlFormatJackson instance reusable by properly handling object state - posted by "Thamme Gowda N (JIRA)" <ji...@apache.org> on 2016/04/15 07:11:25 UTC, 0 replies.
- [jira] [Assigned] (NUTCH-2250) CommonCrawlDumper : Invalid format + skipped parts - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2016/04/15 07:36:25 UTC, 0 replies.
- [jira] [Work started] (NUTCH-2250) CommonCrawlDumper : Invalid format + skipped parts - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2016/04/15 07:36:25 UTC, 0 replies.
- [jira] [Created] (NUTCH-2252) Allow phantomjs as a browser for selenium options - posted by "Kim Whitehall (JIRA)" <ji...@apache.org> on 2016/04/16 18:08:25 UTC, 0 replies.
- [GitHub] nutch pull request: Nutch-2252: Allow phantomjs as a browser for s... - posted by kwhitehall <gi...@git.apache.org> on 2016/04/16 21:58:17 UTC, 0 replies.
- [jira] [Commented] (NUTCH-2252) Allow phantomjs as a browser for selenium options - posted by "Kim Whitehall (JIRA)" <ji...@apache.org> on 2016/04/16 22:00:26 UTC, 0 replies.
- [jira] [Assigned] (NUTCH-2252) Allow phantomjs as a browser for selenium options - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2016/04/16 23:05:25 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2252) Allow phantomjs as a browser for selenium options - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2016/04/16 23:05:25 UTC, 1 replies.
- [jira] [Updated] (NUTCH-2250) CommonCrawlDumper : Invalid format + skipped parts - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2016/04/18 00:33:25 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-2250) CommonCrawlDumper : Invalid format + skipped parts - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2016/04/18 00:33:25 UTC, 0 replies.
- [GitHub] nutch pull request: fix for NUTCH-2191 contributed by karanjeets - posted by asfgit <gi...@git.apache.org> on 2016/04/18 00:35:58 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-2191) Add protocol-htmlunit - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2016/04/18 00:37:25 UTC, 1 replies.
- Build failed in Jenkins: Nutch-trunk #3359 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2016/04/18 00:44:47 UTC, 0 replies.
- [jira] [Reopened] (NUTCH-2191) Add protocol-htmlunit - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2016/04/18 09:15:25 UTC, 0 replies.
- [GitHub] nutch pull request: fix for NUTCH-2191 - fixing Nutch build - cont... - posted by karanjeets <gi...@git.apache.org> on 2016/04/18 09:47:48 UTC, 1 replies.
- Build failed in Jenkins: Nutch-trunk #3360 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2016/04/18 12:56:20 UTC, 0 replies.
- Jenkins build failures after git migration - posted by Sebastian Nagel <wa...@googlemail.com> on 2016/04/18 13:40:23 UTC, 3 replies.
- [jira] [Commented] (NUTCH-1604) ProtocolFactory not thread-safe - posted by "Leon Misakyan (JIRA)" <ji...@apache.org> on 2016/04/19 19:09:25 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1785) Ability to index raw content - posted by "Federico Bonelli (JIRA)" <ji...@apache.org> on 2016/04/20 10:09:25 UTC, 2 replies.
- [jira] [Created] (NUTCH-2253) ProtocolFactory still not thread-safe - posted by "Leon Misakyan (JIRA)" <ji...@apache.org> on 2016/04/20 17:54:25 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2253) ProtocolFactory still not thread-safe - posted by "Leon Misakyan (JIRA)" <ji...@apache.org> on 2016/04/20 17:55:25 UTC, 4 replies.
- Jenkins build is back to normal : Nutch-trunk #3361 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2016/04/20 23:03:02 UTC, 0 replies.
- Jenkins build is back to normal : Nutch-nutchgora #1553 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2016/04/20 23:28:53 UTC, 0 replies.
- [jira] [Created] (NUTCH-2254) Charset issues when using -addBinaryContent and -base64 options - posted by "Federico Bonelli (JIRA)" <ji...@apache.org> on 2016/04/21 14:36:25 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2254) Charset issues when using -addBinaryContent and -base64 options - posted by "Federico Bonelli (JIRA)" <ji...@apache.org> on 2016/04/21 14:41:25 UTC, 0 replies.
- [jira] [Comment Edited] (NUTCH-2254) Charset issues when using -addBinaryContent and -base64 options - posted by "Federico Bonelli (JIRA)" <ji...@apache.org> on 2016/04/21 14:42:25 UTC, 0 replies.
- Re: GSoC 2016: You are a mentor for Furkan KAMACI - posted by Lewis John Mcgibbney <le...@gmail.com> on 2016/04/22 21:33:34 UTC, 0 replies.
- [jira] [Commented] (NUTCH-2254) Charset issues when using -addBinaryContent and -base64 options - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2016/04/25 13:26:12 UTC, 4 replies.
- [GitHub] nutch pull request: Option to include inlinks in commonscrawl dump - posted by thammegowda <gi...@git.apache.org> on 2016/04/25 14:53:49 UTC, 1 replies.
- [GitHub] nutch pull request: NUTCH-2254 Indexer: character set issue with -... - posted by sebastian-nagel <gi...@git.apache.org> on 2016/04/25 15:23:32 UTC, 1 replies.
- [jira] [Assigned] (NUTCH-2254) Charset issues when using -addBinaryContent and -base64 options - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2016/04/25 15:25:12 UTC, 0 replies.
- GSoC Acceptance for Security Layer for NutchServer (NUTCH-1756) - posted by Furkan KAMACI <fu...@gmail.com> on 2016/04/25 15:25:33 UTC, 0 replies.
- [Nutch Wiki] Trivial Update of "bin/nutch_generate" by SebastianNagel - posted by Apache Wiki <wi...@apache.org> on 2016/04/26 13:47:49 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-2245) Developed the NGram Model on the existing Unigram Cosine Similarity Model - posted by "Sujen Shah (JIRA)" <ji...@apache.org> on 2016/04/26 19:34:12 UTC, 0 replies.
- [jira] [Created] (NUTCH-2255) WARCExporter to generate request records - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2016/04/27 14:46:12 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2255) WARCExporter to generate request records - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2016/04/27 14:50:13 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1756) Security layer for NutchServer - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2016/04/27 20:58:12 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-2254) Charset issues when using -addBinaryContent and -base64 options - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2016/04/27 23:00:16 UTC, 0 replies.
- [jira] [Created] (NUTCH-2256) Inconsistent log level practice - posted by "songwanging (JIRA)" <ji...@apache.org> on 2016/04/29 00:02:12 UTC, 0 replies.
- Need to update version control page and SVN docs to point to Git - posted by "Mattmann, Chris A (3980)" <ch...@jpl.nasa.gov> on 2016/04/29 03:14:38 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1824) protocol-http using proxy not working with https sites - posted by "Jason Wang (JIRA)" <ji...@apache.org> on 2016/04/29 04:27:12 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2256) Inconsistent log level practice - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2016/04/29 18:11:12 UTC, 1 replies.
- [jira] [Commented] (NUTCH-2256) Inconsistent log level practice - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2016/04/29 18:15:12 UTC, 1 replies.
- [jira] [Assigned] (NUTCH-2256) Inconsistent log level practice - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2016/04/29 18:15:13 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-2256) Inconsistent log level practice - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2016/04/29 18:50:12 UTC, 0 replies.
- [jira] [Closed] (NUTCH-2256) Inconsistent log level practice - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2016/04/29 19:07:13 UTC, 0 replies.
- Build failed in Jenkins: Nutch-nutchgora #1554 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2016/04/29 19:39:20 UTC, 0 replies.