- [Nutch Wiki] Trivial Update of "PluginCentral" by LewisJohnMcgibbney - posted by Apache Wiki <wi...@apache.org> on 2013/07/01 02:22:11 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #2262 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2013/07/01 06:05:01 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1593) normalize option missing in SegmentMerger's usage - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2013/07/01 12:04:20 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1581) CrawlDB csv output to include metadata - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2013/07/01 12:08:20 UTC, 2 replies.
- [jira] [Commented] (NUTCH-1327) QueryStringNormalizer - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2013/07/01 12:30:20 UTC, 3 replies.
- Jenkins build is back to normal : Nutch-trunk #2263 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2013/07/01 13:33:41 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1593) normalize option missing in SegmentMerger's usage - posted by "Hudson (JIRA)" <ji...@apache.org> on 2013/07/01 13:34:22 UTC, 0 replies.
- Adding nutch stage - posted by Ahmet Emre Aladağ <em...@agmlab.com> on 2013/07/01 14:31:43 UTC, 2 replies.
- [jira] [Commented] (NUTCH-1594) count variable is never changed in ParseUtil class - posted by "lufeng (JIRA)" <ji...@apache.org> on 2013/07/01 15:35:20 UTC, 1 replies.
- [jira] [Updated] (NUTCH-1327) QueryStringNormalizer - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2013/07/01 16:56:20 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1581) CrawlDB csv output to include metadata - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2013/07/02 10:38:20 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1327) QueryStringNormalizer - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2013/07/02 10:38:20 UTC, 0 replies.
- [jira] [Created] (NUTCH-1595) Upgrade to Tika 1.4 - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2013/07/02 10:44:19 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1595) Upgrade to Tika 1.4 - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2013/07/02 10:54:20 UTC, 5 replies.
- [jira] [Created] (NUTCH-1596) NodeWalker NPE on next node - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2013/07/02 11:10:19 UTC, 0 replies.
- [jira] [Created] (NUTCH-1597) HeadingsParseFilter to trim and remove exess whitespace - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2013/07/02 11:12:20 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1597) HeadingsParseFilter to trim and remove exess whitespace - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2013/07/02 11:16:25 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1586) Non-db_success records should have interval.max - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2013/07/02 13:47:20 UTC, 1 replies.
- [jira] [Created] (NUTCH-1598) ElasticSearchIndexer to read ImmutableSettings from config - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2013/07/02 14:54:20 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1598) ElasticSearchIndexer to read ImmutableSettings from config - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2013/07/02 14:54:21 UTC, 2 replies.
- Re: [VOTE] Apache Nutch 2.2.1 RC#1 - posted by Lewis John Mcgibbney <le...@gmail.com> on 2013/07/02 18:08:25 UTC, 0 replies.
- [RESULT] WAS Re: [VOTE] Apache Nutch 2.2.1 RC#1 - posted by Lewis John Mcgibbney <le...@gmail.com> on 2013/07/02 18:28:17 UTC, 0 replies.
- [ANNOUNCE] Apache Nutch v2.2.1 Released - posted by Lewis John Mcgibbney <le...@gmail.com> on 2013/07/02 18:32:00 UTC, 2 replies.
- [jira] [Created] (NUTCH-1599) Obtain consensus on new description of Nutch - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2013/07/02 18:51:20 UTC, 0 replies.
- [jira] [Assigned] (NUTCH-1599) Obtain consensus on new description of Nutch - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2013/07/02 18:51:21 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1599) Obtain consensus on new description of Nutch - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2013/07/02 18:51:22 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1599) Obtain consensus on new description of Nutch - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2013/07/02 19:43:21 UTC, 3 replies.
- Re: Inlinks not being saved in the database - posted by brian4 <bq...@gmail.com> on 2013/07/02 20:50:26 UTC, 0 replies.
- [jira] [Created] (NUTCH-1600) Injector overwrite does not always work properly - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2013/07/03 10:38:19 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1600) Injector overwrite does not always work properly - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2013/07/03 10:40:20 UTC, 0 replies.
- [jira] [Created] (NUTCH-1601) ElasticSearchIndexer fails to properly delete documents - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2013/07/03 12:49:20 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1601) ElasticSearchIndexer fails to properly delete documents - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2013/07/03 12:59:19 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1595) Upgrade to Tika 1.4 - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2013/07/03 13:03:20 UTC, 6 replies.
- [jira] [Commented] (NUTCH-1600) Injector overwrite does not always work properly - posted by "lufeng (JIRA)" <ji...@apache.org> on 2013/07/03 16:44:21 UTC, 1 replies.
- [jira] [Created] (NUTCH-1602) improve the readability of metadata in readdb dump normal - posted by "lufeng (JIRA)" <ji...@apache.org> on 2013/07/03 16:54:21 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1602) improve the readability of metadata in readdb dump normal - posted by "lufeng (JIRA)" <ji...@apache.org> on 2013/07/03 16:56:19 UTC, 1 replies.
- [jira] [Commented] (NUTCH-1602) improve the readability of metadata in readdb dump normal - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2013/07/03 17:00:26 UTC, 7 replies.
- [jira] [Commented] (NUTCH-1524) Internal links are not being saved even with change in parameter (db.ignore.internal.links) - posted by "Brian (JIRA)" <ji...@apache.org> on 2013/07/03 18:12:22 UTC, 0 replies.
- [jira] [Comment Edited] (NUTCH-1524) Internal links are not being saved even with change in parameter (db.ignore.internal.links) - posted by "Brian (JIRA)" <ji...@apache.org> on 2013/07/03 18:14:20 UTC, 1 replies.
- [jira] [Updated] (NUTCH-1524) Internal links are not being saved even with change in parameter (db.ignore.internal.links) - posted by "Brian (JIRA)" <ji...@apache.org> on 2013/07/03 19:22:22 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1599) Obtain consensus on new description of Nutch - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2013/07/03 21:32:21 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1596) NodeWalker NPE on next node - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2013/07/03 23:56:20 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1596) NodeWalker NPE on next node - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2013/07/04 00:08:20 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #2267 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2013/07/04 06:04:15 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1596) HeadingsParseFilter not thread safe - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2013/07/04 10:45:25 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1600) Injector overwrite does not always work properly - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2013/07/04 10:51:20 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1601) ElasticSearchIndexer fails to properly delete documents - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2013/07/04 10:59:22 UTC, 0 replies.
- Jenkins build is back to normal : Nutch-trunk #2268 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2013/07/04 11:08:03 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1597) HeadingsParseFilter to trim and remove exess whitespace - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2013/07/04 11:09:20 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1601) ElasticSearchIndexer fails to properly delete documents - posted by "Hudson (JIRA)" <ji...@apache.org> on 2013/07/04 11:09:21 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1597) HeadingsParseFilter to trim and remove exess whitespace - posted by "Hudson (JIRA)" <ji...@apache.org> on 2013/07/04 12:11:21 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1596) HeadingsParseFilter not thread safe - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2013/07/04 13:15:47 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1596) HeadingsParseFilter not thread safe - posted by "Hudson (JIRA)" <ji...@apache.org> on 2013/07/04 14:09:48 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1598) ElasticSearchIndexer to read ImmutableSettings from config - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2013/07/04 14:55:48 UTC, 1 replies.
- [jira] [Commented] (NUTCH-1520) SegmentMerger looses records - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2013/07/04 15:15:49 UTC, 2 replies.
- [jira] [Resolved] (NUTCH-1602) improve the readability of metadata in readdb dump normal - posted by "lufeng (JIRA)" <ji...@apache.org> on 2013/07/04 17:09:48 UTC, 0 replies.
- nutch and solr's source code explanatory - posted by Mustafa Elkhiat <me...@gmail.com> on 2013/07/04 22:08:18 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1520) SegmentMerger looses records - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2013/07/05 10:53:49 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1598) ElasticSearchIndexer to read ImmutableSettings from config - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2013/07/05 11:05:48 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1595) Upgrade to Tika 1.4 - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2013/07/05 12:29:49 UTC, 0 replies.
- Build failed in Jenkins: Nutch-nutchgora #674 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2013/07/05 13:04:34 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #2274 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2013/07/05 13:04:34 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1603) ZIP parser complains about truncated PDF file - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2013/07/05 15:03:48 UTC, 0 replies.
- [jira] [Created] (NUTCH-1603) ZIP parser complains about truncated PDF file - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2013/07/05 15:03:48 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1124) JUnit test for scoring-opic - posted by "Talat UYARER (JIRA)" <ji...@apache.org> on 2013/07/05 16:24:05 UTC, 8 replies.
- [jira] [Created] (NUTCH-1604) ProtocolFactory not thread-safe - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2013/07/05 16:28:07 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1604) ProtocolFactory not thread-safe - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2013/07/05 16:28:08 UTC, 1 replies.
- [jira] [Commented] (NUTCH-1406) index-metadata plugin: conversion to Solr date format - posted by "Antoinette (JIRA)" <ji...@apache.org> on 2013/07/05 16:43:49 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1457) Nutch2 Refactor the update process so that fetched items are only processed once - posted by "Riyaz Shaik (JIRA)" <ji...@apache.org> on 2013/07/05 18:15:50 UTC, 9 replies.
- Jenkins build is back to normal : Nutch-nutchgora #675 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2013/07/06 06:10:12 UTC, 0 replies.
- Jenkins build is back to normal : Nutch-trunk #2275 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2013/07/06 06:15:18 UTC, 0 replies.
- nutch 1.6 api documentation - posted by Mustafa Elkhiat <me...@gmail.com> on 2013/07/06 15:52:11 UTC, 1 replies.
- change the layout of search page of solr - posted by Mustafa Elkhiat <me...@gmail.com> on 2013/07/06 20:23:56 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1591) Incorrect conversion of ByteBuffer to String - posted by "Jason Howes (JIRA)" <ji...@apache.org> on 2013/07/07 08:23:49 UTC, 0 replies.
- [jira] [Created] (NUTCH-1605) mime type detector recognizes xlsx as zip file - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2013/07/07 22:31:48 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1605) mime type detector recognizes xlsx as zip file - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2013/07/07 22:53:48 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1486) Upgrade to Solr 4.3.0 - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2013/07/08 09:59:48 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1604) ProtocolFactory not thread-safe - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2013/07/08 10:45:49 UTC, 2 replies.
- [jira] [Resolved] (NUTCH-1604) ProtocolFactory not thread-safe - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2013/07/08 10:51:49 UTC, 0 replies.
- [jira] [Created] (NUTCH-1606) Check that Factory classes use the cache in a thread safe way - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2013/07/08 10:59:48 UTC, 0 replies.
- [jira] [Created] (NUTCH-1607) Make inproper multiValued field configurable - posted by "-christian (JIRA)" <ji...@apache.org> on 2013/07/08 13:33:48 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1607) Make inproper multiValued field configurable - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2013/07/08 13:43:49 UTC, 0 replies.
- [jira] [Comment Edited] (NUTCH-1457) Nutch2 Refactor the update process so that fetched items are only processed once - posted by "rerngvit yanggratoke (JIRA)" <ji...@apache.org> on 2013/07/09 00:27:48 UTC, 4 replies.
- [jira] [Updated] (NUTCH-1486) Upgrade to Solr 4.3.0 - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2013/07/09 10:27:49 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1377) Add option to index via CloudSolrServer instead - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2013/07/09 12:09:48 UTC, 2 replies.
- [jira] [Created] (NUTCH-1608) SolrDeleteDuplicates bug: choosing preferred page when duplicates does not work - posted by "Brian (JIRA)" <ji...@apache.org> on 2013/07/09 17:15:50 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1608) SolrDeleteDuplicates bug: choosing preferred page when duplicates does not work - posted by "Brian (JIRA)" <ji...@apache.org> on 2013/07/09 17:21:48 UTC, 1 replies.
- [jira] [Updated] (NUTCH-1457) Nutch2 Refactor the update process so that fetched items are only processed once - posted by "senthil kumar (JIRA)" <ji...@apache.org> on 2013/07/10 13:17:49 UTC, 1 replies.
- [jira] [Commented] (NUTCH-1124) JUnit test for scoring-opic - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2013/07/10 22:09:49 UTC, 2 replies.
- [jira] [Commented] (NUTCH-1605) mime type detector recognizes xlsx as zip file - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2013/07/10 23:11:48 UTC, 1 replies.
- [jira] [Created] (NUTCH-1609) java.net.MalformedURLException when running nutch crawl with apache-nutch-2.1.jar with hadoop - posted by "vishal toshniwal (JIRA)" <ji...@apache.org> on 2013/07/11 15:35:49 UTC, 0 replies.
- [jira] [Comment Edited] (NUTCH-1124) JUnit test for scoring-opic - posted by "Talat UYARER (JIRA)" <ji...@apache.org> on 2013/07/12 14:49:48 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1609) java.net.MalformedURLException when running nutch crawl with apache-nutch-2.1.jar with hadoop - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2013/07/12 20:30:10 UTC, 0 replies.
- [jira] [Created] (NUTCH-1610) Can't run individual unit tests for plugins in nutch 2.x - posted by "Brian (JIRA)" <ji...@apache.org> on 2013/07/12 23:19:47 UTC, 0 replies.
- [jira] [Issue Comment Deleted] (NUTCH-1124) JUnit test for scoring-opic - posted by "Talat UYARER (JIRA)" <ji...@apache.org> on 2013/07/13 13:59:49 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1610) Can't run individual unit tests for plugins in nutch 2.x - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2013/07/13 17:49:51 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1609) java.net.MalformedURLException when running nutch crawl with apache-nutch-2.1.jar with hadoop - posted by "vishal toshniwal (JIRA)" <ji...@apache.org> on 2013/07/14 07:04:49 UTC, 2 replies.
- Build failed in Jenkins: Nutch-trunk #2285 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2013/07/15 06:04:48 UTC, 0 replies.
- [jira] [Created] (NUTCH-1611) Elastic Search Indexer Creates field in elastic search "boost" as a string value, so cannot be used in custom boost queries - posted by "Nicholas Waltham (JIRA)" <ji...@apache.org> on 2013/07/15 10:14:48 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1611) Elastic Search Indexer Creates field in elastic search "boost" as a string value, so cannot be used in custom boost queries - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2013/07/15 11:08:49 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1611) Elastic Search Indexer Creates field in elastic search "boost" as a string value, so cannot be used in custom boost queries - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2013/07/15 11:32:48 UTC, 1 replies.
- [jira] [Commented] (NUTCH-1084) ReadDB url throws exception - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2013/07/15 14:38:49 UTC, 0 replies.
- Jenkins build is back to normal : Nutch-trunk #2286 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2013/07/16 06:12:28 UTC, 0 replies.
- [jira] [Created] (NUTCH-1612) Getting URl Malformed exception with Nutch 2.2 and Hadoop 1.0.3 - posted by "Amit Yadav (JIRA)" <ji...@apache.org> on 2013/07/16 07:50:49 UTC, 0 replies.
- [jira] [Created] (NUTCH-1613) Timeouts in protocol-httpclient when crawling same host with >2 threads and added cookie strings for both http protocols - posted by "Brian (JIRA)" <ji...@apache.org> on 2013/07/16 23:46:49 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1613) Timeouts in protocol-httpclient when crawling same host with >2 threads and added cookie strings for both http protocols - posted by "Brian (JIRA)" <ji...@apache.org> on 2013/07/16 23:50:50 UTC, 1 replies.
- [jira] [Updated] (NUTCH-1610) Can't run individual unit tests for plugins in nutch 2.x - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2013/07/17 00:04:49 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1228) Change mapred.task.timeout to mapreduce.task.timeout in fetcher - posted by "Gül Ahmet Türkoğlu (JIRA)" <ji...@apache.org> on 2013/07/17 12:45:09 UTC, 3 replies.
- [jira] [Commented] (NUTCH-1613) Timeouts in protocol-httpclient when crawling same host with >2 threads and added cookie strings for both http protocols - posted by "lufeng (JIRA)" <ji...@apache.org> on 2013/07/17 16:54:48 UTC, 2 replies.
- [jira] [Created] (NUTCH-1614) Plugin to exclude URLs matching regex list from indexing - to enable crawl but do not index - posted by "Brian (JIRA)" <ji...@apache.org> on 2013/07/17 19:12:50 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1614) Plugin to exclude URLs matching regex list from indexing - to enable crawl but do not index - posted by "Brian (JIRA)" <ji...@apache.org> on 2013/07/17 19:16:49 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1614) Plugin to exclude URLs matching regex list from indexing - to enable crawl but do not index - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2013/07/17 19:22:48 UTC, 2 replies.
- [jira] [Comment Edited] (NUTCH-1614) Plugin to exclude URLs matching regex list from indexing - to enable crawl but do not index - posted by "Brian (JIRA)" <ji...@apache.org> on 2013/07/17 20:29:48 UTC, 0 replies.
- [jira] [Reopened] (NUTCH-1300) Indexer to normalize URL's - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2013/07/17 20:39:46 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1300) Indexer to filter and normalize URL's - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2013/07/17 20:39:47 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1300) Indexer to filter and normalize URL's - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2013/07/17 20:39:47 UTC, 0 replies.
- [jira] [Closed] (NUTCH-1612) Getting URl Malformed exception with Nutch 2.2 and Hadoop 1.0.3 - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2013/07/17 21:15:47 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1228) Change mapred.task.timeout to mapreduce.task.timeout in fetcher - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2013/07/17 21:17:48 UTC, 2 replies.
- [jira] [Commented] (NUTCH-1612) Getting URl Malformed exception with Nutch 2.2 and Hadoop 1.0.3 - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2013/07/17 22:59:46 UTC, 0 replies.
- [jira] [Created] (NUTCH-1615) Implementing A Feature for Fetching From Websites Dump - posted by "cihad güzel (JIRA)" <ji...@apache.org> on 2013/07/19 14:58:48 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1615) Implementing A Feature for Fetching From Websites Dump - posted by "cihad güzel (JIRA)" <ji...@apache.org> on 2013/07/19 15:02:48 UTC, 0 replies.
- [jira] [Created] (NUTCH-1616) SegmentMerger missing proper crawl_fetch datum - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2013/07/19 16:08:49 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1616) SegmentMerger missing proper crawl_fetch datum - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2013/07/19 16:34:49 UTC, 8 replies.
- hello - posted by Arvind Srini <ar...@gmail.com> on 2013/07/23 03:24:46 UTC, 0 replies.
- hey. - posted by Arvind Srini <ar...@gmail.com> on 2013/07/23 04:20:06 UTC, 0 replies.
- [jira] [Created] (NUTCH-1617) IndexerMapReduce to consider latest fetchDatum - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2013/07/24 16:17:50 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1473) Column length too big for column 'text' (max = 21845); use BLOB or TEXT instead - posted by "Brian (JIRA)" <ji...@apache.org> on 2013/07/24 21:53:50 UTC, 0 replies.
- [Nutch Wiki] Trivial Update of "Nutch2Tutorial" by LewisJohnMcgibbney - posted by Apache Wiki <wi...@apache.org> on 2013/07/24 23:16:07 UTC, 0 replies.
- [jira] [Created] (NUTCH-1618) Fetches some websites multiple times for long lasting queues - posted by "Talat UYARER (JIRA)" <ji...@apache.org> on 2013/07/25 10:59:53 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1618) Fetches some websites multiple times for long lasting queues - posted by "Talat UYARER (JIRA)" <ji...@apache.org> on 2013/07/25 11:01:59 UTC, 2 replies.
- [jira] [Commented] (NUTCH-1617) IndexerMapReduce to consider latest fetchDatum - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2013/07/25 12:03:47 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1587) misspelled property "threshold" in conf/log4j.properties - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2013/07/25 23:19:48 UTC, 0 replies.
- Nutch Downloads not available - posted by Walter Tietze <ti...@neofonie.de> on 2013/07/26 18:14:14 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1465) Support sitemaps in Nutch - posted by "Brian (JIRA)" <ji...@apache.org> on 2013/07/26 20:39:51 UTC, 0 replies.
- Re: dev Digest 28 Jul 2013 16:44:00 -0000 Issue 1657 - posted by Lewis John Mcgibbney <le...@gmail.com> on 2013/07/29 00:15:54 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1317) Max content length by MIME-type - posted by "cihad güzel (JIRA)" <ji...@apache.org> on 2013/07/29 13:13:48 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1317) Max content length by MIME-type - posted by "cihad güzel (JIRA)" <ji...@apache.org> on 2013/07/29 13:23:49 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1616) SegmentMerger missing proper crawl_fetch datum - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2013/07/29 14:25:49 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-806) Merge CrawlDBScanner with CrawlDBReader - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2013/07/29 15:41:48 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1413) Fetcher to record response time - posted by "Talat UYARER (JIRA)" <ji...@apache.org> on 2013/07/31 09:23:52 UTC, 1 replies.
- [jira] [Commented] (NUTCH-1413) Fetcher to record response time - posted by "Talat UYARER (JIRA)" <ji...@apache.org> on 2013/07/31 09:23:53 UTC, 2 replies.