- [jira] Closed: (NUTCH-895) Urls with characters like [? = ] getting filtered out. - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2010/09/01 10:10:53 UTC, 0 replies.
- crawling webpage results - posted by Shanthoosh PV <sh...@flipkart.com> on 2010/09/01 10:14:21 UTC, 1 reply.
- [jira] Commented: (NUTCH-891) Nutch build should not depend on unversioned local deps - posted by "Enis Soztutar (JIRA)" <ji...@apache.org> on 2010/09/01 14:13:53 UTC, 0 replies.
- Nutch 2.0 Help - posted by David Stuart <da...@progressivealliance.co.uk> on 2010/09/02 13:58:18 UTC, 5 replies.
- [Nutch Wiki] Update of "androidyou" by androidyou - posted by Apache Wiki <wi...@apache.org> on 2010/09/03 20:19:10 UTC, 0 replies.
- [Nutch Wiki] Update of "RunningNutchAndSolr" by androidyou - posted by Apache Wiki <wi...@apache.org> on 2010/09/03 20:21:56 UTC, 0 replies.
- [jira] Created: (NUTCH-897) Subcollection requires blacklist element - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2010/09/06 13:37:32 UTC, 0 replies.
- [jira] Commented: (NUTCH-716) Make subcollection index filed multivalued - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2010/09/06 13:52:33 UTC, 0 replies.
- [jira] Issue Comment Edited: (NUTCH-716) Make subcollection index filed multivalued - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2010/09/06 15:34:33 UTC, 2 replies.
- unsubscribe - posted by SeongPyo Kim <sp...@gmail.com> on 2010/09/06 17:32:34 UTC, 0 replies.
- [jira] Created: (NUTCH-898) Multi valued subcollection is not multi valued - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2010/09/06 18:44:33 UTC, 0 replies.
- [jira] Closed: (NUTCH-898) Multi valued subcollection is not multi valued - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2010/09/07 13:16:32 UTC, 0 replies.
- nutch 2.0 (trunk) - posted by Faruk Berksöz <fb...@gmail.com> on 2010/09/07 14:50:13 UTC, 2 replies.
- [jira] Created: (NUTCH-899) java.sql.BatchUpdateException: Data truncation: Data too long for column 'content' at row 1 - posted by "faruk berksöz (JIRA)" <ji...@apache.org> on 2010/09/07 15:40:34 UTC, 0 replies.
- [jira] Commented: (NUTCH-899) java.sql.BatchUpdateException: Data truncation: Data too long for column 'content' at row 1 - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2010/09/07 16:30:33 UTC, 1 reply.
- [jira] Closed: (NUTCH-899) java.sql.BatchUpdateException: Data truncation: Data too long for column 'content' at row 1 - posted by "faruk berksöz (JIRA)" <ji...@apache.org> on 2010/09/07 17:31:33 UTC, 0 replies.
- [jira] Created: (NUTCH-900) Confusion in nutch-default between http.content.limit and file.content.limit - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2010/09/08 12:06:34 UTC, 0 replies.
- [jira] Updated: (NUTCH-900) Confusion in nutch-default between http.content.limit and file.content.limit - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2010/09/08 12:08:34 UTC, 2 replies.
- [jira] Created: (NUTCH-901) Make index-more plug-in configurable - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2010/09/08 12:42:33 UTC, 0 replies.
- [Nutch Wiki] Update of "GORA_HBase" by JulienNioche - posted by Apache Wiki <wi...@apache.org> on 2010/09/08 12:47:18 UTC, 0 replies.
- [Nutch Wiki] Update of "FrontPage" by JulienNioche - posted by Apache Wiki <wi...@apache.org> on 2010/09/08 12:47:51 UTC, 0 replies.
- [jira] Updated: (NUTCH-901) Make index-more plug-in configurable - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2010/09/08 12:56:32 UTC, 4 replies.
- [jira] Assigned: (NUTCH-900) Confusion in nutch-default between http.content.limit and file.content.limit - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2010/09/08 13:02:33 UTC, 0 replies.
- [jira] Closed: (NUTCH-900) Confusion in nutch-default between http.content.limit and file.content.limit - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2010/09/08 13:06:33 UTC, 0 replies.
- [jira] Commented: (NUTCH-407) Make Nutch crawling parent directories for file protocol configurable - posted by "Andrey Sapegin (JIRA)" <ji...@apache.org> on 2010/09/08 14:12:33 UTC, 1 reply.
- [jira] Updated: (NUTCH-893) DataStore.put() silently loses records when executed from multiple processes - posted by "Enis Soztutar (JIRA)" <ji...@apache.org> on 2010/09/08 15:23:32 UTC, 0 replies.
- [jira] Created: (NUTCH-902) Add all necessary files and configuration so that nutch can be used with different backends out-of-the-box - posted by "Enis Soztutar (JIRA)" <ji...@apache.org> on 2010/09/08 15:39:32 UTC, 0 replies.
- [jira] Created: (NUTCH-903) RESUME_KEY field in FetcherJob.Java has not been get correctly - posted by "faruk berksöz (JIRA)" <ji...@apache.org> on 2010/09/08 16:46:33 UTC, 0 replies.
- [jira] Updated: (NUTCH-903) RESUME_KEY field in FetcherJob.Java has not been get correctly - posted by "faruk berksöz (JIRA)" <ji...@apache.org> on 2010/09/08 16:48:32 UTC, 4 replies.
- [jira] Closed: (NUTCH-903) RESUME_KEY field in FetcherJob.Java has not been get correctly - posted by "faruk berksöz (JIRA)" <ji...@apache.org> on 2010/09/08 17:02:34 UTC, 0 replies.
- [jira] Created: (NUTCH-904) "-resume" option is always processed as "false" in FetcherJob. - posted by "faruk berksöz (JIRA)" <ji...@apache.org> on 2010/09/08 17:54:33 UTC, 0 replies.
- [jira] Updated: (NUTCH-904) "-resume" option is always processed as "false" in FetcherJob. - posted by "faruk berksöz (JIRA)" <ji...@apache.org> on 2010/09/08 18:00:37 UTC, 0 replies.
- [jira] Commented: (NUTCH-893) DataStore.put() silently loses records when executed from multiple processes - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2010/09/08 18:36:33 UTC, 2 replies.
- Re: [VOTE] Apache Nutch 1.2 Release Candidate #1 - posted by Andrzej Bialecki <ab...@getopt.org> on 2010/09/10 18:59:18 UTC, 1 reply.
- [jira] Created: (NUTCH-905) Configurable file protocol parent directory crawling - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2010/09/11 03:57:41 UTC, 0 replies.
- [jira] Work started: (NUTCH-905) Configurable file protocol parent directory crawling - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2010/09/11 03:59:32 UTC, 0 replies.
- [jira] Resolved: (NUTCH-905) Configurable file protocol parent directory crawling - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2010/09/11 05:51:35 UTC, 0 replies.
- [VOTE] Apache Nutch 1.2 Release Candidate #2 - posted by "Mattmann, Chris A (388J)" <ch...@jpl.nasa.gov> on 2010/09/11 07:01:35 UTC, 0 replies.
- [Nutch Wiki] Update of "Crawl" by cmd - posted by Apache Wiki <wi...@apache.org> on 2010/09/13 03:11:04 UTC, 0 replies.
- Crawl reverted to revision 10 on Nutch Wiki - posted by Apache Wiki <wi...@apache.org> on 2010/09/13 03:59:02 UTC, 0 replies.
- [jira] Closed: (NUTCH-893) DataStore.put() silently loses records when executed from multiple processes - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2010/09/13 16:19:34 UTC, 0 replies.
- [jira] Created: (NUTCH-906) Nutch OpenSearch sometimes raises DOMExceptions due to Lucene column names not being valid XML tag names - posted by "Asheesh Laroia (JIRA)" <ji...@apache.org> on 2010/09/13 21:14:36 UTC, 0 replies.
- [jira] Updated: (NUTCH-906) Nutch OpenSearch sometimes raises DOMExceptions due to Lucene column names not being valid XML tag names - posted by "Asheesh Laroia (JIRA)" <ji...@apache.org> on 2010/09/13 21:16:50 UTC, 0 replies.
- [jira] Closed: (NUTCH-904) "-resume" option is always processed as "false" in FetcherJob. - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2010/09/14 15:45:54 UTC, 0 replies.
- [jira] Commented: (NUTCH-882) Design a Host table in GORA - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2010/09/15 14:01:35 UTC, 3 replies.
- [jira] Commented: (NUTCH-864) Fetcher generates entries with status 0 - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2010/09/15 14:46:33 UTC, 1 reply.
- [Nutch Wiki] Trivial Update of "MapReduce" by AndreRicardo - posted by Apache Wiki <wi...@apache.org> on 2010/09/15 15:57:09 UTC, 0 replies.
- [jira] Created: (NUTCH-907) DataStore API doesn't support multiple storage areas for multiple disjoint crawls - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2010/09/15 17:00:34 UTC, 0 replies.
- [Nutch Wiki] Update of "InternalDocumentation" by AndrzejBialecki - posted by Apache Wiki <wi...@apache.org> on 2010/09/15 22:51:28 UTC, 0 replies.
- [Nutch Wiki] Update of "CrawlDatumStates" by AndrzejBialecki - posted by Apache Wiki <wi...@apache.org> on 2010/09/15 22:58:06 UTC, 1 reply.
- New attachment added to page CrawlDatumStates on Nutch Wiki - posted by Apache Wiki <wi...@apache.org> on 2010/09/15 22:58:40 UTC, 1 reply.
- [jira] Commented: (NUTCH-907) DataStore API doesn't support multiple storage areas for multiple disjoint crawls - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2010/09/16 13:28:33 UTC, 1 reply.
- [jira] Assigned: (NUTCH-880) REST API (and webapp) for Nutch - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2010/09/16 14:54:33 UTC, 0 replies.
- [jira] Updated: (NUTCH-880) REST API (and webapp) for Nutch - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2010/09/16 14:56:32 UTC, 0 replies.
- [jira] Created: (NUTCH-908) Infinite Loop and Null Pointer Bugs in Searching - posted by "Dennis Kubes (JIRA)" <ji...@apache.org> on 2010/09/16 22:10:50 UTC, 0 replies.
- [jira] Updated: (NUTCH-908) Infinite Loop and Null Pointer Bugs in Searching - posted by "Dennis Kubes (JIRA)" <ji...@apache.org> on 2010/09/16 22:12:32 UTC, 0 replies.
- [jira] Commented: (NUTCH-862) HttpClient null pointer exception - posted by "Peter Lundberg (JIRA)" <ji...@apache.org> on 2010/09/17 06:17:34 UTC, 0 replies.
- [jira] Assigned: (NUTCH-908) Infinite Loop and Null Pointer Bugs in Searching - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2010/09/17 06:23:33 UTC, 0 replies.
- [jira] Commented: (NUTCH-908) Infinite Loop and Null Pointer Bugs in Searching - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2010/09/17 06:23:34 UTC, 0 replies.
- [jira] Work started: (NUTCH-908) Infinite Loop and Null Pointer Bugs in Searching - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2010/09/17 06:23:34 UTC, 0 replies.
- [jira] Assigned: (NUTCH-862) HttpClient null pointer exception - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2010/09/17 16:42:35 UTC, 0 replies.
- [jira] Resolved: (NUTCH-862) HttpClient null pointer exception - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2010/09/17 16:54:33 UTC, 0 replies.
- [jira] Assigned: (NUTCH-906) Nutch OpenSearch sometimes raises DOMExceptions due to Lucene column names not being valid XML tag names - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2010/09/17 21:05:35 UTC, 0 replies.
- [jira] Resolved: (NUTCH-906) Nutch OpenSearch sometimes raises DOMExceptions due to Lucene column names not being valid XML tag names - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2010/09/17 21:07:35 UTC, 0 replies.
- [jira] Resolved: (NUTCH-908) Infinite Loop and Null Pointer Bugs in Searching - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2010/09/19 06:50:43 UTC, 0 replies.
- [VOTE] Apache Nutch 1.2 Release Candidate #3 - posted by "Mattmann, Chris A (388J)" <ch...@jpl.nasa.gov> on 2010/09/19 18:04:15 UTC, 1 reply.
- Build failed in Hudson: Nutch-trunk #1252 - posted by Apache Hudson Server <hu...@hudson.apache.org> on 2010/09/20 06:02:19 UTC, 0 replies.
- [jira] Created: (NUTCH-909) Add alternative search-provider to Nutch site - posted by "Alex Baranau (JIRA)" <ji...@apache.org> on 2010/09/20 15:56:35 UTC, 0 replies.
- [jira] Updated: (NUTCH-909) Add alternative search-provider to Nutch site - posted by "Alex Baranau (JIRA)" <ji...@apache.org> on 2010/09/20 15:58:33 UTC, 1 reply.
- [jira] Commented: (NUTCH-909) Add alternative search-provider to Nutch site - posted by "Alex Baranau (JIRA)" <ji...@apache.org> on 2010/09/20 16:02:33 UTC, 3 replies.
- [jira] Assigned: (NUTCH-909) Add alternative search-provider to Nutch site - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2010/09/20 16:26:40 UTC, 0 replies.
- [jira] Work started: (NUTCH-909) Add alternative search-provider to Nutch site - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2010/09/20 16:26:41 UTC, 0 replies.
- [jira] Updated: (NUTCH-894) Move statistical language identification from indexing to parsing step - posted by "Sertan Alkan (JIRA)" <ji...@apache.org> on 2010/09/20 17:54:33 UTC, 0 replies.
- [jira] Issue Comment Edited: (NUTCH-901) Make index-more plug-in configurable - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2010/09/20 17:54:35 UTC, 0 replies.
- [jira] Assigned: (NUTCH-901) Make index-more plug-in configurable - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2010/09/21 03:49:34 UTC, 0 replies.
- [jira] Work started: (NUTCH-901) Make index-more plug-in configurable - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2010/09/21 03:55:32 UTC, 0 replies.
- [jira] Resolved: (NUTCH-901) Make index-more plug-in configurable - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2010/09/21 06:06:43 UTC, 0 replies.
- [VOTE] Apache Nutch 1.2 Release Candidate #4 - posted by "Mattmann, Chris A (388J)" <ch...@jpl.nasa.gov> on 2010/09/21 07:10:24 UTC, 5 replies.
- [jira] Commented: (NUTCH-896) Gora-based tests need to have their own config files - posted by "Sertan Alkan (JIRA)" <ji...@apache.org> on 2010/09/21 15:39:34 UTC, 0 replies.
- [jira] Commented: (NUTCH-880) REST API (and webapp) for Nutch - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2010/09/21 17:44:34 UTC, 1 reply.
- Constellio Enterprise Search announces its first Open Source release - posted by Rida Benjelloun <ri...@doculibre.com> on 2010/09/22 22:16:38 UTC, 0 replies.
- Build failed in Hudson: Nutch-trunk #1254 - posted by Apache Hudson Server <hu...@hudson.apache.org> on 2010/09/23 06:13:40 UTC, 0 replies.
- Build failed in Hudson: Nutch-trunk #1255 - posted by Apache Hudson Server <hu...@hudson.apache.org> on 2010/09/24 06:05:53 UTC, 0 replies.
- [RESULT] [VOTE] Apache Nutch 1.2 Release Candidate #4 - posted by "Mattmann, Chris A (388J)" <ch...@jpl.nasa.gov> on 2010/09/24 23:33:51 UTC, 0 replies.
- [Nutch Wiki] Update of "FrontPage" by ChrisMattmann - posted by Apache Wiki <wi...@apache.org> on 2010/09/25 00:16:45 UTC, 0 replies.
- [ANNOUNCE] Apache Nutch 1.2 released - posted by "Mattmann, Chris A (388J)" <ch...@jpl.nasa.gov> on 2010/09/25 00:21:16 UTC, 0 replies.
- Build failed in Hudson: Nutch-trunk #1257 - posted by Apache Hudson Server <hu...@hudson.apache.org> on 2010/09/26 06:08:26 UTC, 0 replies.
- [jira] Resolved: (NUTCH-909) Add alternative search-provider to Nutch site - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2010/09/27 04:21:39 UTC, 0 replies.
- [jira] Work started: (NUTCH-73) A page for CSV results - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2010/09/27 04:40:36 UTC, 0 replies.
- [jira] Commented: (NUTCH-73) A page for CSV results - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2010/09/27 04:40:36 UTC, 0 replies.
- [jira] Updated: (NUTCH-73) A page for CSV results - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2010/09/27 04:40:36 UTC, 0 replies.
- [jira] Resolved: (NUTCH-577) Use explicit tika-config.xml file to enable mime magic detection to be turned on and off - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2010/09/27 04:42:34 UTC, 0 replies.
- Build failed in Hudson: Nutch-trunk #1258 - posted by Apache Hudson Server <hu...@hudson.apache.org> on 2010/09/27 06:00:37 UTC, 0 replies.
- [jira] Updated: (NUTCH-907) DataStore API doesn't support multiple storage areas for multiple disjoint crawls - posted by "Sertan Alkan (JIRA)" <ji...@apache.org> on 2010/09/27 14:28:33 UTC, 0 replies.
- [Nutch Wiki] Update of "PublicServers" by JulienNioche - posted by Apache Wiki <wi...@apache.org> on 2010/09/27 15:51:42 UTC, 0 replies.
- [jira] Created: (NUTCH-910) Cached.jsp has a bug with encoding - posted by "Attila Pados (JIRA)" <ji...@apache.org> on 2010/09/27 17:05:33 UTC, 0 replies.
- [jira] Commented: (NUTCH-766) Tika parser - posted by "Diego Campo (JIRA)" <ji...@apache.org> on 2010/09/28 02:50:44 UTC, 0 replies.
- [jira] Updated: (NUTCH-910) Cached.jsp has a bug with encoding - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2010/09/28 02:57:32 UTC, 0 replies.
- [jira] Updated: (NUTCH-882) Design a Host table in GORA - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2010/09/28 16:34:49 UTC, 0 replies.
- Build failed in Hudson: Nutch-trunk #1261 - posted by Apache Hudson Server <hu...@hudson.apache.org> on 2010/09/30 06:31:30 UTC, 0 replies.