- [jira] Created: (NUTCH-563) Include custom fields in BasicQueryFilter - posted by "julien nioche (JIRA)" <ji...@apache.org> on 2007/10/01 18:24:50 UTC, 0 replies.
- [jira] Updated: (NUTCH-563) Include custom fields in BasicQueryFilter - posted by "julien nioche (JIRA)" <ji...@apache.org> on 2007/10/01 18:24:51 UTC, 0 replies.
- Hits estimation? - posted by Hal Fulton <ru...@gmail.com> on 2007/10/01 18:36:28 UTC, 3 replies.
- [jira] Commented: (NUTCH-559) NTLM, Basic and Digest Authentication schemes for web/proxy server - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/10/03 09:42:50 UTC, 1 replies.
- [jira] Commented: (NUTCH-563) Include custom fields in BasicQueryFilter - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/10/03 09:46:50 UTC, 1 replies.
- Strange RemoteException thrown while doing a parse of ~64m documents - posted by Ned Rockson <nr...@stanford.edu> on 2007/10/03 10:11:15 UTC, 2 replies.
- Failed Fetch Pages - Index Verification and Optimization - posted by karthik085 <ka...@gmail.com> on 2007/10/03 23:15:44 UTC, 0 replies.
- [jira] Created: (NUTCH-564) External parser supports encoding attribute - posted by "Antony Bowesman (JIRA)" <ji...@apache.org> on 2007/10/03 23:42:50 UTC, 0 replies.
- [jira] Updated: (NUTCH-564) External parser supports encoding attribute - posted by "Antony Bowesman (JIRA)" <ji...@apache.org> on 2007/10/03 23:50:50 UTC, 2 replies.
- [jira] Commented: (NUTCH-508) ${hadoop.log.dir} and ${hadoop.log.file} are not propagated to the tasktracker - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/10/04 16:59:51 UTC, 2 replies.
- First Plugin - posted by Sagar Vibhute <sa...@gmail.com> on 2007/10/05 15:09:27 UTC, 6 replies.
- Two suggestions - posted by misc <mi...@robotgenius.net> on 2007/10/06 03:25:56 UTC, 2 replies.
- Java Packages (missing) - posted by Sagar Vibhute <sa...@gmail.com> on 2007/10/07 14:01:08 UTC, 2 replies.
- [jira] Updated: (NUTCH-562) Port mime type framework to use Tika mime detection framework - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2007/10/07 17:32:50 UTC, 1 replies.
- [jira] Issue Comment Edited: (NUTCH-562) Port mime type framework to use Tika mime detection framework - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2007/10/07 17:34:51 UTC, 0 replies.
- [jira] Resolved: (NUTCH-508) ${hadoop.log.dir} and ${hadoop.log.file} are not propagated to the tasktracker - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/10/08 12:58:50 UTC, 0 replies.
- [jira] Closed: (NUTCH-508) ${hadoop.log.dir} and ${hadoop.log.file} are not propagated to the tasktracker - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/10/08 13:00:50 UTC, 0 replies.
- [jira] Resolved: (NUTCH-562) Port mime type framework to use Tika mime detection framework - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2007/10/09 02:24:50 UTC, 0 replies.
- [jira] Closed: (NUTCH-562) Port mime type framework to use Tika mime detection framework - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2007/10/09 02:26:50 UTC, 5 replies.
- [jira] Commented: (NUTCH-562) Port mime type framework to use Tika mime detection framework - posted by "Hudson (JIRA)" <ji...@apache.org> on 2007/10/09 06:30:51 UTC, 0 replies.
- [jira] Created: (NUTCH-565) Arc File to Nutch Segments Converter - posted by "Dennis Kubes (JIRA)" <ji...@apache.org> on 2007/10/09 07:08:50 UTC, 0 replies.
- [jira] Updated: (NUTCH-565) Arc File to Nutch Segments Converter - posted by "Dennis Kubes (JIRA)" <ji...@apache.org> on 2007/10/09 07:11:50 UTC, 8 replies.
- InvertLinks logical problem? - posted by Ned Rockson <nr...@stanford.edu> on 2007/10/09 09:02:39 UTC, 0 replies.
- Disregard last post - posted by Ned Rockson <nr...@stanford.edu> on 2007/10/09 09:08:53 UTC, 0 replies.
- Re: Downloading file types to file system - posted by eyal edri <ey...@gmail.com> on 2007/10/09 11:30:24 UTC, 1 replies.
- [jira] Commented: (NUTCH-565) Arc File to Nutch Segments Converter - posted by "Sami Siren (JIRA)" <ji...@apache.org> on 2007/10/09 21:07:50 UTC, 9 replies.
- download code works in fetch class but not in plugins class - posted by eyal edri <ey...@gmail.com> on 2007/10/10 12:59:47 UTC, 0 replies.
- [jira] Created: (NUTCH-566) Sun's URL class has bug in creation of relative query URLs - posted by "Doug Cook (JIRA)" <ji...@apache.org> on 2007/10/10 17:56:50 UTC, 0 replies.
- [jira] Updated: (NUTCH-566) Sun's URL class has bug in creation of relative query URLs - posted by "Doug Cook (JIRA)" <ji...@apache.org> on 2007/10/10 17:58:50 UTC, 0 replies.
- Choices in Nutch Web interface? - posted by Christopher Bader <cb...@kratylos.com> on 2007/10/10 20:16:58 UTC, 2 replies.
- How to add a field to results? - posted by Sagar Vibhute <sa...@gmail.com> on 2007/10/11 05:30:39 UTC, 0 replies.
- [jira] Commented: (NUTCH-442) Integrate Solr/Nutch - posted by "Enis Soztutar (JIRA)" <ji...@apache.org> on 2007/10/15 17:33:50 UTC, 1 replies.
- [jira] Updated: (NUTCH-488) Avoid parsing uneccessary links and get a more relevant outlink list - posted by "Marcin Okraszewski (JIRA)" <ji...@apache.org> on 2007/10/15 22:26:50 UTC, 0 replies.
- Anyone looked for a better HTML parser? - posted by Doug Cook <na...@candiru.com> on 2007/10/15 22:44:53 UTC, 3 replies.
- [jira] Commented: (NUTCH-488) Avoid parsing uneccessary links and get a more relevant outlink list - posted by "Dennis Kubes (JIRA)" <ji...@apache.org> on 2007/10/16 02:23:50 UTC, 4 replies.
- Selective/Configurable HTML Parsing? - posted by Sagar Vibhute <sa...@gmail.com> on 2007/10/16 08:57:28 UTC, 1 replies.
- [jira] Commented: (NUTCH-436) Incorrect handling of relative paths when the embedded URL path is empty - posted by "Doug Cook (JIRA)" <ji...@apache.org> on 2007/10/16 17:39:50 UTC, 0 replies.
- Cached PDF files? - posted by Sagar Vibhute <sa...@gmail.com> on 2007/10/17 13:29:59 UTC, 0 replies.
- [jira] Created: (NUTCH-567) Proper (?) handling of URIs in TagSoup. - posted by "Dawid Weiss (JIRA)" <ji...@apache.org> on 2007/10/17 14:07:51 UTC, 0 replies.
- [jira] Updated: (NUTCH-567) Proper (?) handling of URIs in TagSoup. - posted by "Dawid Weiss (JIRA)" <ji...@apache.org> on 2007/10/17 14:07:53 UTC, 1 replies.
- writing a new parse-exe plugin - posted by eyal edri <ey...@gmail.com> on 2007/10/17 15:53:54 UTC, 3 replies.
- [jira] Commented: (NUTCH-567) Proper (?) handling of URIs in TagSoup. - posted by "Doug Cook (JIRA)" <ji...@apache.org> on 2007/10/17 17:39:50 UTC, 6 replies.
- Re: writing a new parse-exe plugin [NullPointerException] - posted by eyal edri <ey...@gmail.com> on 2007/10/18 14:01:29 UTC, 0 replies.
- Re: Scoring API issues (LONG) - posted by Sami Siren <ss...@gmail.com> on 2007/10/18 18:21:32 UTC, 1 replies.
- [jira] Resolved: (NUTCH-488) Avoid parsing uneccessary links and get a more relevant outlink list - posted by "Dennis Kubes (JIRA)" <ji...@apache.org> on 2007/10/18 18:55:50 UTC, 0 replies.
- [jira] Closed: (NUTCH-488) Avoid parsing uneccessary links and get a more relevant outlink list - posted by "Dennis Kubes (JIRA)" <ji...@apache.org> on 2007/10/18 18:55:51 UTC, 0 replies.
- JIRA, Resolving and Closing Issues - posted by Dennis Kubes <ku...@apache.org> on 2007/10/18 18:58:04 UTC, 2 replies.
- [jira] Created: (NUTCH-568) Indexer does not update the Lucene "TITLE" field - posted by "smorales (JIRA)" <ji...@apache.org> on 2007/10/19 21:30:50 UTC, 0 replies.
- [jira] Updated: (NUTCH-568) Indexer does not update the Lucene "TITLE" field - posted by "smorales (JIRA)" <ji...@apache.org> on 2007/10/19 21:30:50 UTC, 0 replies.
- Out of order key while in reduce phase - posted by Ned Rockson <nr...@stanford.edu> on 2007/10/20 01:40:30 UTC, 1 replies.
- Nutch/Lucene unique ID for every item crawled? - posted by Sagar Vibhute <sa...@gmail.com> on 2007/10/20 11:24:30 UTC, 3 replies.
- How to write a parse plugin and not get NullPointerException on ParseData - posted by eyal edri <ey...@gmail.com> on 2007/10/21 10:39:57 UTC, 1 replies.
- [jira] Commented: (NUTCH-568) Indexer does not update the Lucene "TITLE" field - posted by "Sami Siren (JIRA)" <ji...@apache.org> on 2007/10/22 19:46:50 UTC, 0 replies.
- [jira] Created: (NUTCH-569) Protocol plugins should report progress to the fetcher - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2007/10/23 14:26:51 UTC, 0 replies.
- web2 plugin - posted by karthik085 <ka...@gmail.com> on 2007/10/23 23:25:01 UTC, 0 replies.
- Update to URL ordering from Generator.java - posted by Ned Rockson <ne...@discoveryengine.com> on 2007/10/23 23:41:51 UTC, 5 replies.
- Optimizing nutch crawl for fastest performance - posted by eyal edri <ey...@gmail.com> on 2007/10/24 17:52:51 UTC, 0 replies.
- What are the side effects of running crawl multiple times? - posted by Paolo Castagna <pa...@hp.com> on 2007/10/25 10:31:17 UTC, 1 replies.
- Upgrading Nutch to Hadoop 0.14 or 0.15 - posted by Dennis Kubes <ku...@apache.org> on 2007/10/25 21:11:28 UTC, 2 replies.
- [jira] Updated: (NUTCH-552) Upgrade Nutch to Hadoop 0.15.x - posted by "Dennis Kubes (JIRA)" <ji...@apache.org> on 2007/10/25 22:30:50 UTC, 3 replies.
- [jira] Commented: (NUTCH-501) Implement a different caching mechanism for objects cached in configuration - posted by "Dennis Kubes (JIRA)" <ji...@apache.org> on 2007/10/25 23:26:50 UTC, 3 replies.
- [jira] Created: (NUTCH-570) Improvement of URL Ordering in Generator.java - posted by "Ned Rockson (JIRA)" <ji...@apache.org> on 2007/10/26 02:00:52 UTC, 0 replies.
- [jira] Updated: (NUTCH-570) Improvement of URL Ordering in Generator.java - posted by "Ned Rockson (JIRA)" <ji...@apache.org> on 2007/10/26 03:33:51 UTC, 0 replies.
- Quote Please? - posted by James Phillips <ja...@keypot.com> on 2007/10/26 07:18:30 UTC, 0 replies.
- open source enterprise content search solution based on Nutch - http://nutch-iice.sourceforge.net - posted by joel gump <bi...@gmail.com> on 2007/10/26 12:38:07 UTC, 0 replies.
- nutch to search local filesystem - posted by prem kumar <pr...@gmail.com> on 2007/10/26 16:53:48 UTC, 0 replies.
- [jira] Updated: (NUTCH-548) Move URLNormalizer from Outlink to ParseOutputFormat - posted by "Emmanuel Joke (JIRA)" <ji...@apache.org> on 2007/10/27 19:32:50 UTC, 0 replies.
- [jira] Commented: (NUTCH-552) Upgrade Nutch to Hadoop 0.15.x - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/10/28 15:58:50 UTC, 5 replies.
- [jira] Assigned: (NUTCH-501) Implement a different caching mechanism for objects cached in configuration - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/10/28 20:27:50 UTC, 0 replies.
- [jira] Updated: (NUTCH-501) Implement a different caching mechanism for objects cached in configuration - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/10/28 20:27:51 UTC, 0 replies.
- [jira] Issue Comment Edited: (NUTCH-501) Implement a different caching mechanism for objects cached in configuration - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/10/28 20:29:50 UTC, 0 replies.
- Adding new class to nutch - posted by eyal edri <ey...@gmail.com> on 2007/10/29 15:39:40 UTC, 2 replies.
- [jira] Resolved: (NUTCH-501) Implement a different caching mechanism for objects cached in configuration - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/10/29 15:59:50 UTC, 0 replies.
- [jira] Updated: (NUTCH-559) NTLM, Basic and Digest Authentication schemes for web/proxy server - posted by "Susam Pal (JIRA)" <ji...@apache.org> on 2007/10/30 11:47:50 UTC, 0 replies.
- How to extract specified information from html? - posted by zhao xiuwen <re...@gmail.com> on 2007/10/31 09:19:14 UTC, 2 replies.
- [jira] Commented: (NUTCH-566) Sun's URL class has bug in creation of relative query URLs - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/10/31 18:18:51 UTC, 1 replies.
- [jira] Assigned: (NUTCH-559) NTLM, Basic and Digest Authentication schemes for web/proxy server - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/10/31 18:30:51 UTC, 0 replies.
- [jira] Commented: (NUTCH-548) Move URLNormalizer from Outlink to ParseOutputFormat - posted by "Doğacan Güney (JIRA)" <ji...@apache.org> on 2007/10/31 18:38:50 UTC, 0 replies.
- Next move with JIRA ticket - posted by Ned Rockson <ne...@discoveryengine.com> on 2007/10/31 18:42:25 UTC, 2 replies.