- [Nutch Wiki] Update of "ContributorsGroup" by ChrisMattmann - posted by Apache Wiki <wi...@apache.org> on 2015/07/01 01:46:59 UTC, 2 replies.
- [jira] [Commented] (NUTCH-2038) Naive Bayes classifier based html Parse filter (for filtering outlinks) - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2015/07/01 01:58:05 UTC, 15 replies.
- [jira] [Created] (NUTCH-2053) Uncessary dependencies included in ivy.xml (post NUTCH-2038) - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2015/07/01 05:39:04 UTC, 0 replies.
- [jira] [Commented] (NUTCH-2053) Uncessary dependencies included in ivy.xml (post NUTCH-2038) - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2015/07/01 05:53:04 UTC, 1 replies.
- Build failed in Jenkins: Nutch-trunk #3183 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2015/07/01 06:17:43 UTC, 0 replies.
- [GitHub] nutch pull request: NUTCH-2038 - posted by asfgit <gi...@git.apache.org> on 2015/07/01 06:22:17 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-2038) Naive Bayes classifier based html Parse filter (for filtering outlinks) - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2015/07/01 06:23:05 UTC, 1 replies.
- [jira] [Comment Edited] (NUTCH-2038) Naive Bayes classifier based html Parse filter (for filtering outlinks) - posted by "Asitang Mishra (JIRA)" <ji...@apache.org> on 2015/07/01 06:24:05 UTC, 3 replies.
- [jira] [Work started] (NUTCH-2052) Enhance index-static to allow configurable delimiters - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2015/07/01 06:25:04 UTC, 1 replies.
- [jira] [Assigned] (NUTCH-2052) Enhance index-static to allow configurable delimiters - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2015/07/01 06:25:04 UTC, 0 replies.
- [jira] [Reopened] (NUTCH-2052) Enhance index-static to allow configurable delimiters - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2015/07/01 06:25:04 UTC, 1 replies.
- [jira] [Commented] (NUTCH-2052) Enhance index-static to allow configurable delimiters - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2015/07/01 06:39:04 UTC, 12 replies.
- Build failed in Jenkins: Nutch-trunk #3184 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2015/07/01 06:50:37 UTC, 0 replies.
- [jira] [Reopened] (NUTCH-2038) Naive Bayes classifier based html Parse filter (for filtering outlinks) - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2015/07/01 07:24:06 UTC, 0 replies.
- [jira] [Work started] (NUTCH-2038) Naive Bayes classifier based html Parse filter (for filtering outlinks) - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2015/07/01 07:24:06 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #3185 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2015/07/01 07:30:34 UTC, 0 replies.
- Jenkins build is back to normal : Nutch-trunk #3186 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2015/07/01 08:32:40 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1684) ParseMeta to be added before fetch schedulers are run - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2015/07/01 08:58:04 UTC, 1 replies.
- [jira] [Resolved] (NUTCH-1692) SegmentReader broken in distributed mode - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2015/07/01 09:02:05 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1684) ParseMeta to be added before fetch schedulers are run - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2015/07/01 09:02:05 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1980) Jexl expressions for CrawlDbReader - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2015/07/01 09:16:04 UTC, 1 replies.
- [jira] [Updated] (NUTCH-1980) Jexl expressions for CrawlDbReader - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2015/07/01 09:38:04 UTC, 1 replies.
- [jira] [Resolved] (NUTCH-1980) Jexl expressions for CrawlDbReader - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2015/07/01 09:39:05 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1692) SegmentReader broken in distributed mode - posted by "Hudson (JIRA)" <ji...@apache.org> on 2015/07/01 09:53:05 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1838) Host and domain based regex and automaton filtering - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2015/07/01 10:21:04 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1730) Scoring-depth optionally not to increment depth for external hosts - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2015/07/01 10:38:06 UTC, 1 replies.
- [jira] [Updated] (NUTCH-1449) Optionally delete documents skipped by IndexingFilters - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2015/07/01 12:14:04 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1449) Optionally delete documents skipped by IndexingFilters - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2015/07/01 12:16:04 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1940) Port HTTP POST Authentication to 2.X - posted by "Talat UYARER (JIRA)" <ji...@apache.org> on 2015/07/01 14:09:04 UTC, 1 replies.
- [jira] [Commented] (NUTCH-1940) Port HTTP POST Authentication to 2.X - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2015/07/01 14:10:04 UTC, 0 replies.
- [jira] [Created] (NUTCH-2054) When Using Form Auth settings can not read response body - posted by "Talat UYARER (JIRA)" <ji...@apache.org> on 2015/07/01 14:12:04 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2054) When Using Form Auth settings can not read response body - posted by "Talat UYARER (JIRA)" <ji...@apache.org> on 2015/07/01 14:31:04 UTC, 0 replies.
- [jira] [Commented] (NUTCH-2049) Upgrade Trunk to Hadoop > 2.6 stable - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2015/07/01 14:53:04 UTC, 2 replies.
- [jira] [Created] (NUTCH-2055) Random Crawl Delay - posted by "Talat UYARER (JIRA)" <ji...@apache.org> on 2015/07/01 15:53:04 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2055) Random Crawl Delay - posted by "Talat UYARER (JIRA)" <ji...@apache.org> on 2015/07/01 15:56:04 UTC, 2 replies.
- [jira] [Updated] (NUTCH-2056) Move the Mahout and Lucene dependencies to the plugin from the main ivy.xml for the Naive Bayes Parse Filter (NUTCH-2038) - posted by "Asitang Mishra (JIRA)" <ji...@apache.org> on 2015/07/01 17:30:06 UTC, 0 replies.
- [jira] [Created] (NUTCH-2056) Move the Mahout and Lucene dependencies to the plugin from the main ivy.xml for the Naive Bayes Parse Filter (NUTCH-2038) - posted by "Asitang Mishra (JIRA)" <ji...@apache.org> on 2015/07/01 17:30:06 UTC, 0 replies.
- [jira] [Created] (NUTCH-2057) Put all the files produced during training of the model for Naive Bayes classifier, in the Naive Bayed Parse Filter (NUTCH-2038), in a single folder - posted by "Asitang Mishra (JIRA)" <ji...@apache.org> on 2015/07/01 17:33:04 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2057) Put all the files produced during training of the model for Naive Bayes classifier, in the Naive Bayes Parse Filter (NUTCH-2038), in a single folder - posted by "Asitang Mishra (JIRA)" <ji...@apache.org> on 2015/07/01 17:35:05 UTC, 0 replies.
- [jira] [Commented] (NUTCH-2055) Random Crawl Delay - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2015/07/01 17:42:04 UTC, 1 replies.
- [Nutch Wiki] Update of "AsitangMishra" by AsitangMishra - posted by Apache Wiki <wi...@apache.org> on 2015/07/01 17:44:22 UTC, 3 replies.
- [jira] [Created] (NUTCH-2058) Indexer plugin that allows RegEx replacements on the NutchDocument field values - posted by "Peter Ciuffetti (JIRA)" <ji...@apache.org> on 2015/07/02 22:41:05 UTC, 0 replies.
- [jira] [Commented] (NUTCH-2058) Indexer plugin that allows RegEx replacements on the NutchDocument field values - posted by "Peter Ciuffetti (JIRA)" <ji...@apache.org> on 2015/07/02 23:01:04 UTC, 11 replies.
- [GitHub] nutch pull request: Nutch 2052 - Enhancement to index-static to al... - posted by asfgit <gi...@git.apache.org> on 2015/07/03 18:24:15 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-2052) Enhance index-static to allow configurable delimiters - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2015/07/03 18:25:04 UTC, 1 replies.
- Squashing Git Commits - posted by Lewis John Mcgibbney <le...@gmail.com> on 2015/07/03 18:29:20 UTC, 1 replies.
- Build failed in Jenkins: Nutch-trunk #3189 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2015/07/04 10:17:20 UTC, 0 replies.
- [GitHub] nutch pull request: Nutch 2058 - New index-replace plugin that all... - posted by PeterCiuffetti <gi...@git.apache.org> on 2015/07/04 12:36:42 UTC, 5 replies.
- congratulations message for me from GSOC2015 - posted by Cihad Guzel <cg...@gmail.com> on 2015/07/04 14:54:05 UTC, 0 replies.
- GSOC2015- Sitemap crawler roudmap problems - posted by Cihad Guzel <cg...@gmail.com> on 2015/07/04 14:56:53 UTC, 3 replies.
- Build failed in Jenkins: Nutch-trunk #3190 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2015/07/04 21:38:54 UTC, 0 replies.
- [jira] [Created] (NUTCH-2059) protocol-httpclient unit test error on Jenkins - posted by "Peter Ciuffetti (JIRA)" <ji...@apache.org> on 2015/07/04 22:28:04 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2059) protocol-httpclient unit test error on Jenkins - posted by "Peter Ciuffetti (JIRA)" <ji...@apache.org> on 2015/07/04 22:32:04 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2059) protocol-httpclient, protocol-http unit test errors on Jenkins - posted by "Peter Ciuffetti (JIRA)" <ji...@apache.org> on 2015/07/04 22:35:04 UTC, 3 replies.
- [GitHub] nutch pull request: Nutch 2059 - Unit test failures for protocol-h... - posted by PeterCiuffetti <gi...@git.apache.org> on 2015/07/04 23:43:48 UTC, 1 replies.
- [jira] [Commented] (NUTCH-1836) Timeouts in protocol-httpclient when crawling same host with >2 threads NUTCH-1613 is not a complete solution - posted by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2015/07/04 23:44:05 UTC, 1 replies.
- [jira] [Assigned] (NUTCH-2059) protocol-httpclient, protocol-http unit test errors on Jenkins - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2015/07/05 04:41:04 UTC, 0 replies.
- [jira] [Work started] (NUTCH-2059) protocol-httpclient, protocol-http unit test errors on Jenkins - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2015/07/05 04:41:04 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-2059) protocol-httpclient, protocol-http unit test errors on Jenkins - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2015/07/05 04:52:05 UTC, 0 replies.
- Jenkins build is back to normal : Nutch-trunk #3191 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2015/07/05 05:07:28 UTC, 0 replies.
- [jira] [Commented] (NUTCH-2059) protocol-httpclient, protocol-http unit test errors on Jenkins - posted by "Hudson (JIRA)" <ji...@apache.org> on 2015/07/05 05:08:04 UTC, 22 replies.
- Build failed in Jenkins: Nutch-trunk #3192 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2015/07/05 06:15:16 UTC, 0 replies.
- Jenkins build is back to normal : Nutch-trunk #3193 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2015/07/05 07:11:50 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1086) Rewrite protocol-httpclient - posted by "Peter Ciuffetti (JIRA)" <ji...@apache.org> on 2015/07/05 15:16:05 UTC, 1 replies.
- [jira] [Reopened] (NUTCH-2059) protocol-httpclient, protocol-http unit test errors on Jenkins - posted by "Peter Ciuffetti (JIRA)" <ji...@apache.org> on 2015/07/05 15:35:04 UTC, 0 replies.
- Nutch and JS/Css rendering - posted by Talat Uyarer <ta...@uyarer.com> on 2015/07/06 11:34:19 UTC, 3 replies.
- Re: [MASSMAIL]RE: Nutch and JS/Css rendering - posted by Jorge Luis Betancourt González <jl...@uci.cu> on 2015/07/06 19:32:18 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #3195 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2015/07/07 06:26:19 UTC, 0 replies.
- [jira] [Created] (NUTCH-2060) dedup is removing entries with status db_gone - posted by "Steven Hayles (JIRA)" <ji...@apache.org> on 2015/07/07 09:56:05 UTC, 0 replies.
- [jira] [Reopened] (NUTCH-1319) HostNormalizer - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2015/07/07 13:54:04 UTC, 1 replies.
- [jira] [Updated] (NUTCH-1319) HostNormalizer - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2015/07/07 13:54:04 UTC, 1 replies.
- [jira] [Closed] (NUTCH-1319) HostNormalizer - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2015/07/07 13:54:05 UTC, 1 replies.
- Jenkins build is back to normal : Nutch-trunk #3196 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2015/07/08 06:10:31 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #3197 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2015/07/09 06:16:43 UTC, 0 replies.
- [jira] [Assigned] (NUTCH-1824) protocol-http using proxy not working with https sites - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2015/07/10 00:07:05 UTC, 0 replies.
- Jenkins build is back to normal : Nutch-trunk #3198 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2015/07/10 06:14:42 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #3199 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2015/07/11 06:13:36 UTC, 0 replies.
- Jenkins build is back to normal : Nutch-trunk #3200 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2015/07/12 06:07:57 UTC, 0 replies.
- [jira] [Commented] (NUTCH-2060) dedup is removing entries with status db_gone - posted by "Ashish Nerkar (JIRA)" <ji...@apache.org> on 2015/07/12 18:19:05 UTC, 3 replies.
- Build failed in Jenkins: Nutch-trunk #3203 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2015/07/13 23:19:55 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #3204 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2015/07/13 23:43:58 UTC, 0 replies.
- [jira] [Comment Edited] (NUTCH-2059) protocol-httpclient, protocol-http unit test errors on Jenkins - posted by "Peter Ciuffetti (JIRA)" <ji...@apache.org> on 2015/07/13 23:54:05 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #3205 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2015/07/14 02:09:53 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #3206 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2015/07/14 04:40:48 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #3207 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2015/07/14 04:50:59 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #3208 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2015/07/14 06:10:55 UTC, 0 replies.
- Jenkins build is back to normal : Nutch-trunk #3209 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2015/07/14 06:54:21 UTC, 0 replies.
- Jenkins now publishes Nutch test results (added test-plugins) - posted by "Mattmann, Chris A (3980)" <ch...@jpl.nasa.gov> on 2015/07/14 07:00:37 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #3211 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2015/07/14 07:19:17 UTC, 0 replies.
- Jenkins build is back to normal : Nutch-trunk #3212 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2015/07/14 08:09:42 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2048) parse-tika: fix dependencies in plugin.xml - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2015/07/14 22:14:04 UTC, 3 replies.
- [jira] [Work started] (NUTCH-2058) Indexer plugin that allows RegEx replacements on the NutchDocument field values - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2015/07/16 00:08:05 UTC, 0 replies.
- [jira] [Assigned] (NUTCH-2058) Indexer plugin that allows RegEx replacements on the NutchDocument field values - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2015/07/16 00:08:05 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2058) Indexer plugin that allows RegEx replacements on the NutchDocument field values - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2015/07/16 00:08:06 UTC, 1 replies.
- [jira] [Resolved] (NUTCH-2058) Indexer plugin that allows RegEx replacements on the NutchDocument field values - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2015/07/16 03:00:07 UTC, 0 replies.
- [Nutch Wiki] Trivial Update of "IndexStructure" by PeterCiuffetti - posted by Apache Wiki <wi...@apache.org> on 2015/07/16 17:03:08 UTC, 0 replies.
- [Nutch Wiki] Update of "IndexStructure" by PeterCiuffetti - posted by Apache Wiki <wi...@apache.org> on 2015/07/16 17:14:25 UTC, 0 replies.
- [Nutch Wiki] Update of "IndexReplace" by PeterCiuffetti - posted by Apache Wiki <wi...@apache.org> on 2015/07/16 17:58:37 UTC, 0 replies.
- [jira] [Created] (NUTCH-2061) Make core upgrades to all org.apache.httpcomponents dependencies - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2015/07/16 23:49:04 UTC, 0 replies.
- [jira] [Created] (NUTCH-2062) Add Plugin for interacting with Selenium WebDriver - posted by "Michael Joyce (JIRA)" <ji...@apache.org> on 2015/07/20 17:58:04 UTC, 0 replies.
- [jira] [Commented] (NUTCH-2062) Add Plugin for interacting with Selenium WebDriver - posted by "Michael Joyce (JIRA)" <ji...@apache.org> on 2015/07/20 17:58:05 UTC, 8 replies.
- [jira] [Closed] (NUTCH-2044) Support for an expanded HttpHeaders list - posted by "Soren Scott (JIRA)" <ji...@apache.org> on 2015/07/20 18:31:04 UTC, 0 replies.
- [jira] [Created] (NUTCH-2063) Add -mimeStats flag to FileDumper tool - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2015/07/20 22:30:04 UTC, 0 replies.
- [jira] [Created] (NUTCH-2064) URLNormalizer basic to properly encode non-ASCII characters - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2015/07/21 16:12:04 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2064) URLNormalizer basic to properly encode non-ASCII characters - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2015/07/21 16:13:05 UTC, 5 replies.
- [jira] [Created] (NUTCH-2065) Domain URL filter to support protocols - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2015/07/21 17:09:04 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2065) Domain URL filter to support protocols - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2015/07/21 17:10:05 UTC, 0 replies.
- [Nutch Wiki] Trivial Update of "IndexReplace" by PeterCiuffetti - posted by Apache Wiki <wi...@apache.org> on 2015/07/21 17:22:13 UTC, 0 replies.
- [Nutch Wiki] Update of "NutchPropertiesCompleteList" by PeterCiuffetti - posted by Apache Wiki <wi...@apache.org> on 2015/07/21 17:36:34 UTC, 0 replies.
- [GitHub] nutch pull request: NUTCH-2062 - Interactive Selenium Plugin - posted by MJJoyce <gi...@git.apache.org> on 2015/07/21 18:37:19 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2063) Add -mimeStats flag to FileDumper tool - posted by "Michael Joyce (JIRA)" <ji...@apache.org> on 2015/07/21 21:53:04 UTC, 2 replies.
- [jira] [Commented] (NUTCH-2063) Add -mimeStats flag to FileDumper tool - posted by "Michael Joyce (JIRA)" <ji...@apache.org> on 2015/07/21 21:53:05 UTC, 1 replies.
- [jira] [Commented] (NUTCH-2064) URLNormalizer basic to properly encode non-ASCII characters - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2015/07/21 23:13:04 UTC, 3 replies.
- [jira] [Updated] (NUTCH-2021) Use protocol-selenium to Capture Screenshots of the Page as it is Fetched - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2015/07/22 06:01:04 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-2021) Use protocol-selenium to Capture Screenshots of the Page as it is Fetched - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2015/07/22 06:11:05 UTC, 0 replies.
- [jira] [Commented] (NUTCH-2021) Use protocol-selenium to Capture Screenshots of the Page as it is Fetched - posted by "Hudson (JIRA)" <ji...@apache.org> on 2015/07/22 07:12:04 UTC, 1 replies.
- [jira] [Resolved] (NUTCH-2063) Add -mimeStats flag to FileDumper tool - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2015/07/22 14:52:04 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2062) Add Plugin for interacting with Selenium WebDriver - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2015/07/22 16:03:04 UTC, 2 replies.
- [jira] [Updated] (NUTCH-2004) ParseChecker does not handle redirects - posted by "Michael Joyce (JIRA)" <ji...@apache.org> on 2015/07/22 19:01:05 UTC, 0 replies.
- [jira] [Created] (NUTCH-2066) Allow user to specify crawldb and segment db in the Generate JOb REST endpoint - posted by "Sujen Shah (JIRA)" <ji...@apache.org> on 2015/07/23 19:49:04 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2066) Allow user to specify crawldb and segment db in the Generate JOb REST endpoint - posted by "Sujen Shah (JIRA)" <ji...@apache.org> on 2015/07/23 19:49:05 UTC, 0 replies.
- [GitHub] nutch pull request: Fix for NUTCH-2066 contributed by Sujen Shah - posted by sujen1412 <gi...@git.apache.org> on 2015/07/23 19:52:04 UTC, 0 replies.
- [jira] [Commented] (NUTCH-2066) Allow user to specify crawldb and segment db in the Generate JOb REST endpoint - posted by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2015/07/23 19:52:05 UTC, 0 replies.
- [jira] [Commented] (NUTCH-2048) parse-tika: fix dependencies in plugin.xml - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2015/07/23 21:32:04 UTC, 5 replies.
- [jira] [Updated] (NUTCH-2042) parse-html increase chunk size used to detect charset - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2015/07/23 22:47:05 UTC, 1 replies.
- [jira] [Assigned] (NUTCH-2042) parse-html increase chunk size used to detect charset - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2015/07/23 22:47:06 UTC, 0 replies.
- [jira] [Created] (NUTCH-2067) HttpFormAuthentication unable to decode login page when server responds with GZIP encoding - posted by "patrick peck (JIRA)" <ji...@apache.org> on 2015/07/24 11:43:05 UTC, 0 replies.
- [jira] [Assigned] (NUTCH-1517) CloudSearch indexer - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2015/07/24 12:42:05 UTC, 0 replies.
- [jira] [Assigned] (NUTCH-1936) GSoC 2015 - Move Nutch to Hadoop 2.X - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2015/07/25 00:54:06 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1936) GSoC 2015 - Move Nutch to Hadoop 2.X - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2015/07/25 00:57:05 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1936) GSoC 2015 - Move Nutch to Hadoop 2.X - posted by "Michael Joyce (JIRA)" <ji...@apache.org> on 2015/07/25 01:03:04 UTC, 1 replies.
- [jira] [Assigned] (NUTCH-2049) Upgrade Trunk to Hadoop > 2.6 stable - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2015/07/25 07:03:04 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2049) Upgrade Trunk to Hadoop > 2.6 stable - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2015/07/25 07:10:04 UTC, 0 replies.
- [jira] [Closed] (NUTCH-1936) GSoC 2015 - Move Nutch to Hadoop 2.X - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2015/07/25 07:11:04 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2049) Upgrade Trunk to Hadoop > 2.4 stable - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2015/07/25 07:31:05 UTC, 0 replies.
- [jira] [Commented] (NUTCH-2049) Upgrade Trunk to Hadoop > 2.4 stable - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2015/07/25 07:33:04 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1339) Default URL normalization rules to remove page anchors completely - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2015/07/27 10:56:04 UTC, 0 replies.
- [jira] [Created] (NUTCH-2068) Allow subcollection overrides via metadata - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2015/07/27 11:48:04 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2068) Allow subcollection overrides via metadata - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2015/07/27 12:09:04 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1932) Automatically remove orphaned pages - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2015/07/27 12:45:04 UTC, 1 replies.
- [jira] [Created] (NUTCH-2069) Ignore external links based on domain - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2015/07/29 13:13:04 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2069) Ignore external links based on domain - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2015/07/29 13:19:04 UTC, 1 replies.
- [jira] [Assigned] (NUTCH-2062) Add Plugin for interacting with Selenium WebDriver - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2015/07/29 19:14:05 UTC, 0 replies.
- [jira] [Work started] (NUTCH-2062) Add Plugin for interacting with Selenium WebDriver - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2015/07/29 19:14:05 UTC, 0 replies.
- [jira] [Comment Edited] (NUTCH-2062) Add Plugin for interacting with Selenium WebDriver - posted by "Michael Joyce (JIRA)" <ji...@apache.org> on 2015/07/29 19:51:05 UTC, 0 replies.
- [jira] [Commented] (NUTCH-2069) Ignore external links based on domain - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2015/07/29 19:57:04 UTC, 3 replies.
- [jira] [Created] (NUTCH-2070) Allow user to specify segment to Fetch via the REST API - posted by "Sujen Shah (JIRA)" <ji...@apache.org> on 2015/07/30 00:45:04 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2070) Allow user to specify segment to Fetch via the REST API - posted by "Sujen Shah (JIRA)" <ji...@apache.org> on 2015/07/30 00:46:05 UTC, 0 replies.
- [jira] [Created] (NUTCH-2071) A parser failure on a single document may fail crawling job - posted by "Arkadi Kosmynin (JIRA)" <ji...@apache.org> on 2015/07/30 08:12:04 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2071) A parser failure on a single document may fail crawling job - posted by "Arkadi Kosmynin (JIRA)" <ji...@apache.org> on 2015/07/30 08:35:04 UTC, 1 replies.
- [jira] [Updated] (NUTCH-2072) Deflate encoding support is broken when http.content.limit is set to -1 - posted by "Tanguy Moal (JIRA)" <ji...@apache.org> on 2015/07/30 11:23:04 UTC, 0 replies.
- [jira] [Created] (NUTCH-2072) Deflate encoding support is broken when http.content.limit is set to -1 - posted by "Tanguy Moal (JIRA)" <ji...@apache.org> on 2015/07/30 11:23:04 UTC, 0 replies.
- [GitHub] nutch pull request: Fix for NUTCH-2072 - posted by tuxnco <gi...@git.apache.org> on 2015/07/30 11:26:06 UTC, 0 replies.
- [jira] [Commented] (NUTCH-2072) Deflate encoding support is broken when http.content.limit is set to -1 - posted by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2015/07/30 11:27:04 UTC, 1 replies.
- [jira] [Updated] (NUTCH-1785) Ability to index raw content - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2015/07/30 22:21:05 UTC, 0 replies.
- [jira] [Comment Edited] (NUTCH-1785) Ability to index raw content - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2015/07/30 22:22:05 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1785) Ability to index raw content - posted by "Thad Guidry (JIRA)" <ji...@apache.org> on 2015/07/30 22:40:05 UTC, 2 replies.
- [jira] [Resolved] (NUTCH-1785) Ability to index raw content - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2015/07/30 23:30:05 UTC, 0 replies.