- [jira] Created: (NUTCH-761) Avoid cloningCrawlDatum in CrawlDbReducer - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2009/11/03 15:39:59 UTC, 0 replies.
- [jira] Updated: (NUTCH-761) Avoid cloningCrawlDatum in CrawlDbReducer - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2009/11/03 15:41:59 UTC, 0 replies.
- [jira] Created: (NUTCH-762) Alternative Generator which can generate several segments in one parse of the crawlDB - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2009/11/03 16:03:59 UTC, 0 replies.
- [jira] Updated: (NUTCH-762) Alternative Generator which can generate several segments in one parse of the crawlDB - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2009/11/03 16:05:59 UTC, 0 replies.
- Free live video streaming of ApacheCon US 2009 - posted by Michael McCandless <lu...@mikemccandless.com> on 2009/11/04 14:25:25 UTC, 1 replies.
- [Nutch Wiki] Update of "ApacheConUs2009MeetUp" by KenKrugler - posted by Apache Wiki <wi...@apache.org> on 2009/11/04 23:02:35 UTC, 5 replies.
- [Nutch Wiki] Update of "ApacheConUs2009MeetUp" by AndrzejBialecki - posted by Apache Wiki <wi...@apache.org> on 2009/11/04 23:12:11 UTC, 0 replies.
- MergeSegments - map reduce thread death - posted by fa...@butterflycluster.net on 2009/11/05 02:29:14 UTC, 0 replies.
- [jira] Created: (NUTCH-763) Separate configuration files from resources to be included in the job file - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2009/11/05 19:34:32 UTC, 0 replies.
- New attachment added to page Presentations on Nutch Wiki - posted by Apache Wiki <wi...@apache.org> on 2009/11/06 18:29:18 UTC, 1 replies.
- [Nutch Wiki] Update of "Presentations" by AndrzejBialecki - posted by Apache Wiki <wi...@apache.org> on 2009/11/06 18:33:56 UTC, 0 replies.
- Build failed in Hudson: Nutch-trunk #985 - posted by Apache Hudson Server <hu...@hudson.zones.apache.org> on 2009/11/07 05:03:16 UTC, 0 replies.
- Hudson build is back to normal: Nutch-trunk #986 - posted by Apache Hudson Server <hu...@hudson.zones.apache.org> on 2009/11/08 06:41:30 UTC, 0 replies.
- [Nutch Wiki] Update of "GettingNutchRunningWithJboss" by TerrenceCurran - posted by Apache Wiki <wi...@apache.org> on 2009/11/10 02:05:57 UTC, 0 replies.
- [Nutch Wiki] Update of "FrontPage" by TerrenceCurran - posted by Apache Wiki <wi...@apache.org> on 2009/11/10 02:07:55 UTC, 0 replies.
- [jira] Created: (NUTCH-764) Add support for vfsfile:// loading of plugins for JBoss - posted by "tcurran@approachingpi.com (JIRA)" <ji...@apache.org> on 2009/11/10 02:25:32 UTC, 0 replies.
- [jira] Updated: (NUTCH-764) Add support for vfsfile:// loading of plugins for JBoss - posted by "tcurran@approachingpi.com (JIRA)" <ji...@apache.org> on 2009/11/10 02:25:32 UTC, 0 replies.
- [jira] Commented: (NUTCH-764) Add support for vfsfile:// loading of plugins for JBoss - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2009/11/10 11:18:27 UTC, 2 replies.
- Patch to trunk process - posted by David Stuart <da...@progressivealliance.co.uk> on 2009/11/10 11:52:49 UTC, 3 replies.
- Integration with Tika - posted by BrunoWL <bw...@gmail.com> on 2009/11/10 19:28:24 UTC, 3 replies.
- [jira] Commented: (NUTCH-573) Multiple Domains - Query Search - posted by "Srikarthik Venkataraman (JIRA)" <ji...@apache.org> on 2009/11/11 14:39:39 UTC, 0 replies.
- [jira] Updated: (NUTCH-765) Allow Crawl class to call Either Solr or Lucene Indexer - posted by "Dennis Kubes (JIRA)" <ji...@apache.org> on 2009/11/12 22:00:39 UTC, 0 replies.
- [jira] Created: (NUTCH-765) Allow Crawl class to call Either Solr or Lucene Indexer - posted by "Dennis Kubes (JIRA)" <ji...@apache.org> on 2009/11/12 22:00:39 UTC, 0 replies.
- Treating files of Office 2007 - posted by BrunoWL <bw...@gmail.com> on 2009/11/13 17:32:26 UTC, 0 replies.
- [Nutch Wiki] Update of "RunNutchInEclipse1.0" by AnasElghafari - posted by Apache Wiki <wi...@apache.org> on 2009/11/14 11:28:20 UTC, 0 replies.
- Plugin Help - posted by "david.stuart@progressivealliance.co.uk" <da...@progressivealliance.co.uk> on 2009/11/14 16:40:26 UTC, 2 replies.
- Update on Integration with Tika - posted by Julien Nioche <li...@gmail.com> on 2009/11/16 20:13:32 UTC, 9 replies.
- Filtering Pages while crawling - posted by sumittyagi <pi...@gmail.com> on 2009/11/17 19:48:48 UTC, 0 replies.
- [jira] Created: (NUTCH-766) Tika parser - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2009/11/18 15:50:39 UTC, 0 replies.
- [jira] Updated: (NUTCH-766) Tika parser - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2009/11/18 15:52:39 UTC, 1 replies.
- [jira] Updated: (NUTCH-767) Update version of Tika for the MimeType detection - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2009/11/18 15:58:39 UTC, 0 replies.
- [jira] Created: (NUTCH-767) Update version of Tika for the MimeType detection - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2009/11/18 15:58:39 UTC, 0 replies.
- [jira] Assigned: (NUTCH-767) Update version of Tika for the MimeType detection - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2009/11/18 16:04:39 UTC, 0 replies.
- [jira] Commented: (NUTCH-767) Update version of Tika for the MimeType detection - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2009/11/18 16:04:39 UTC, 0 replies.
- [Nutch Wiki] Update of "NutchHadoopTutorial" by ilgiz - posted by Apache Wiki <wi...@apache.org> on 2009/11/18 18:23:44 UTC, 0 replies.
- [Nutch Wiki] Trivial Update of "NutchHadoopTutorial" by ilgiz - posted by Apache Wiki <wi...@apache.org> on 2009/11/18 18:24:16 UTC, 0 replies.
- Now Hbase 20 - posted by work only <vo...@gmail.com> on 2009/11/20 03:00:52 UTC, 0 replies.
- [jira] Assigned: (NUTCH-765) Allow Crawl class to call Either Solr or Lucene Indexer - posted by "Dennis Kubes (JIRA)" <ji...@apache.org> on 2009/11/22 00:29:39 UTC, 0 replies.
- [jira] Resolved: (NUTCH-765) Allow Crawl class to call Either Solr or Lucene Indexer - posted by "Dennis Kubes (JIRA)" <ji...@apache.org> on 2009/11/22 00:35:39 UTC, 0 replies.
- [jira] Closed: (NUTCH-765) Allow Crawl class to call Either Solr or Lucene Indexer - posted by "Dennis Kubes (JIRA)" <ji...@apache.org> on 2009/11/22 00:35:39 UTC, 0 replies.
- [jira] Created: (NUTCH-768) Upgrade Nutch 1.0 to use Hadoop 0.20 - posted by "Dennis Kubes (JIRA)" <ji...@apache.org> on 2009/11/22 00:39:39 UTC, 0 replies.
- [jira] Created: (NUTCH-769) Fetcher to skip queues for URLS getting repeated exceptions - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2009/11/23 12:05:39 UTC, 0 replies.
- [jira] Updated: (NUTCH-769) Fetcher to skip queues for URLS getting repeated exceptions - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2009/11/23 12:07:39 UTC, 1 replies.
- [jira] Created: (NUTCH-770) Timebomb for Fetcher - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2009/11/23 12:23:39 UTC, 0 replies.
- [jira] Updated: (NUTCH-770) Timebomb for Fetcher - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2009/11/23 12:25:39 UTC, 3 replies.
- Plugin Developement Help - posted by "david.stuart@progressivealliance.co.uk" <da...@progressivealliance.co.uk> on 2009/11/24 11:57:13 UTC, 5 replies.
- [Nutch Wiki] Update of "OptimizingCrawls" by DennisKubes - posted by Apache Wiki <wi...@apache.org> on 2009/11/24 16:59:31 UTC, 0 replies.
- [Nutch Wiki] Update of "FrontPage" by DennisKubes - posted by Apache Wiki <wi...@apache.org> on 2009/11/24 17:00:33 UTC, 0 replies.
- wrong wiki front page - posted by Alban Mouton <al...@gmail.com> on 2009/11/24 17:46:21 UTC, 4 replies.
- [jira] Created: (NUTCH-771) Add WebGraph classes to the bin/nutch script - posted by "Dennis Kubes (JIRA)" <ji...@apache.org> on 2009/11/24 21:47:39 UTC, 0 replies.
- [jira] Commented: (NUTCH-768) Upgrade Nutch 1.0 to use Hadoop 0.20 - posted by "Dennis Kubes (JIRA)" <ji...@apache.org> on 2009/11/24 22:17:39 UTC, 1 replies.
- [jira] Commented: (NUTCH-771) Add WebGraph classes to the bin/nutch script - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2009/11/24 22:25:39 UTC, 0 replies.
- [jira] Created: (NUTCH-772) Upgrade Nutch to use Lucene 2.9.1 - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2009/11/25 13:36:39 UTC, 0 replies.
- [jira] Updated: (NUTCH-772) Upgrade Nutch to use Lucene 2.9.1 - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2009/11/25 14:00:39 UTC, 0 replies.
- Re: svn commit: r884075 - /lucene/nutch/trunk/src/java/org/apache/nutch/indexer/solr/SolrIndexer.java - posted by Dennis Kubes <ku...@apache.org> on 2009/11/25 14:36:51 UTC, 3 replies.
- [jira] Created: (NUTCH-773) some minor bugs in AbstractFetchSchedule.java - posted by "Reinhard Schwab (JIRA)" <ji...@apache.org> on 2009/11/25 15:15:39 UTC, 0 replies.
- [jira] Updated: (NUTCH-773) some minor bugs in AbstractFetchSchedule.java - posted by "Reinhard Schwab (JIRA)" <ji...@apache.org> on 2009/11/25 15:19:39 UTC, 1 replies.
- [jira] Updated: (NUTCH-760) Allow field mapping from nutch to solr index - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2009/11/25 16:59:39 UTC, 0 replies.
- [jira] Commented: (NUTCH-773) some minor bugs in AbstractFetchSchedule.java - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2009/11/25 18:12:39 UTC, 1 replies.
- [jira] Closed: (NUTCH-773) some minor bugs in AbstractFetchSchedule.java - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2009/11/25 18:12:39 UTC, 0 replies.
- [jira] Commented: (NUTCH-753) Prevent new Fetcher to retrieve the robots twice - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2009/11/25 18:22:39 UTC, 1 replies.
- [jira] Closed: (NUTCH-753) Prevent new Fetcher to retrieve the robots twice - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2009/11/25 18:22:39 UTC, 0 replies.
- [jira] Commented: (NUTCH-762) Alternative Generator which can generate several segments in one parse of the crawlDB - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2009/11/25 18:38:39 UTC, 0 replies.
- [jira] Commented: (NUTCH-761) Avoid cloningCrawlDatum in CrawlDbReducer - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2009/11/25 19:10:39 UTC, 1 replies.
- [jira] Closed: (NUTCH-761) Avoid cloningCrawlDatum in CrawlDbReducer - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2009/11/25 19:10:39 UTC, 0 replies.
- [jira] Closed: (NUTCH-760) Allow field mapping from nutch to solr index - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2009/11/25 22:00:39 UTC, 0 replies.
- [jira] Commented: (NUTCH-760) Allow field mapping from nutch to solr index - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2009/11/25 22:00:39 UTC, 1 replies.
- [jira] Commented: (NUTCH-772) Upgrade Nutch to use Lucene 2.9.1 - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2009/11/25 22:18:39 UTC, 1 replies.
- [jira] Closed: (NUTCH-772) Upgrade Nutch to use Lucene 2.9.1 - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2009/11/25 22:18:39 UTC, 0 replies.
- [jira] Updated: (NUTCH-768) Upgrade Nutch 1.0 to use Hadoop 0.20 - posted by "Dennis Kubes (JIRA)" <ji...@apache.org> on 2009/11/25 22:34:39 UTC, 0 replies.
- [jira] Resolved: (NUTCH-185) XMLParser is configurable xml parser plugin. - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2009/11/26 04:16:39 UTC, 0 replies.
- [Nutch Wiki] Update of "FrontPage" by Davinder - posted by Apache Wiki <wi...@apache.org> on 2009/11/27 18:49:19 UTC, 3 replies.
- [jira] Commented: (NUTCH-765) Allow Crawl class to call Either Solr or Lucene Indexer - posted by "Hudson (JIRA)" <ji...@apache.org> on 2009/11/28 15:09:20 UTC, 0 replies.
- [jira] Commented: (NUTCH-769) Fetcher to skip queues for URLS getting repeated exceptions - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2009/11/28 16:37:20 UTC, 1 replies.
- [jira] Commented: (NUTCH-770) Timebomb for Fetcher - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2009/11/28 17:05:20 UTC, 4 replies.
- [jira] Commented: (NUTCH-746) NutchBeanConstructor does not close NutchBean upon contextDestroyed, causing resource leak in the container. - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2009/11/28 22:19:20 UTC, 1 replies.
- [jira] Closed: (NUTCH-746) NutchBeanConstructor does not close NutchBean upon contextDestroyed, causing resource leak in the container. - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2009/11/28 22:19:20 UTC, 0 replies.
- [jira] Commented: (NUTCH-738) Close SegmentUpdater when FetchedSegments is closed - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2009/11/28 22:29:20 UTC, 1 replies.
- [jira] Closed: (NUTCH-738) Close SegmentUpdater when FetchedSegments is closed - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2009/11/28 22:29:20 UTC, 0 replies.
- [jira] Closed: (NUTCH-739) SolrDeleteDuplications too slow when using hadoop - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2009/11/28 22:37:20 UTC, 0 replies.
- [jira] Commented: (NUTCH-739) SolrDeleteDuplications too slow when using hadoop - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2009/11/28 22:37:20 UTC, 1 replies.
- [jira] Closed: (NUTCH-755) DomainURLFilter crashes on malformed URL - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2009/11/28 23:20:22 UTC, 0 replies.
- [jira] Commented: (NUTCH-755) DomainURLFilter crashes on malformed URL - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2009/11/28 23:20:23 UTC, 0 replies.
- [jira] Commented: (NUTCH-692) AlreadyBeingCreatedException with Hadoop 0.19 - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2009/11/28 23:24:20 UTC, 1 replies.
- [jira] Closed: (NUTCH-741) Job file includes multiple copies of nutch config files. - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2009/11/28 23:32:20 UTC, 0 replies.
- [jira] Commented: (NUTCH-741) Job file includes multiple copies of nutch config files. - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2009/11/28 23:32:20 UTC, 1 replies.
- [jira] Closed: (NUTCH-712) ParseOutputFormat should catch java.net.MalformedURLException coming from normalizers - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2009/11/28 23:44:20 UTC, 0 replies.
- [jira] Commented: (NUTCH-712) ParseOutputFormat should catch java.net.MalformedURLException coming from normalizers - posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2009/11/28 23:44:20 UTC, 1 replies.
- [Nutch Wiki] Trivial Update of "Automating_Fetches_with_Python" by newacct - posted by Apache Wiki <wi...@apache.org> on 2009/11/29 04:19:00 UTC, 0 replies.
- [jira] Issue Comment Edited: (NUTCH-770) Timebomb for Fetcher - posted by "MilleBii (JIRA)" <ji...@apache.org> on 2009/11/29 21:47:20 UTC, 0 replies.