- [RESULT] was [VOTE] Release Apache Nutch 1.17 RC#1 - posted by Sebastian Nagel <wa...@googlemail.com> on 2020/07/01 09:22:39 UTC, 0 replies.
- [ANNOUNCE] Apache Nutch 1.17 Release - posted by Sebastian Nagel <sn...@apache.org> on 2020/07/02 14:41:50 UTC, 1 replies.
- [jira] [Created] (NUTCH-2795) CrawlDbReader: compress CrawlDb dumps if configured - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2020/07/06 11:41:00 UTC, 0 replies.
- [jira] [Created] (NUTCH-2796) Upgrade to crawler-commons 1.1 - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2020/07/06 11:54:00 UTC, 0 replies.
- [jira] [Assigned] (NUTCH-2796) Upgrade to crawler-commons 1.1 - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2020/07/06 11:57:00 UTC, 0 replies.
- [GitHub] [nutch] sebastian-nagel opened a new pull request #535: [NUTCH-2796] [NUTCH-2730] Update crawler-commons 1.1, SitemapProcessor to treat sitemap URLs as Set instead of List - posted by GitBox <gi...@apache.org> on 2020/07/06 12:16:22 UTC, 0 replies.
- [jira] [Commented] (NUTCH-2796) Upgrade to crawler-commons 1.1 - posted by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2020/07/06 12:17:00 UTC, 2 replies.
- [jira] [Created] (NUTCH-2797) Update Miredot license for REST API documentation creation - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2020/07/06 12:46:00 UTC, 0 replies.
- [jira] [Created] (NUTCH-2798) Nutch v2.4 Not Able to crawl after javax.faces.viewstate - posted by "Mihir Sharma (Jira)" <ji...@apache.org> on 2020/07/06 14:54:00 UTC, 0 replies.
- Re: Regarding the branch 2.x - posted by Sebastian Nagel <wa...@googlemail.com> on 2020/07/07 14:12:26 UTC, 0 replies.
- [jira] [Commented] (NUTCH-2292) Mavenize the build for nutch-core and nutch-plugins - posted by "Shashanka Balakuntala Srinivasa (Jira)" <ji...@apache.org> on 2020/07/07 14:29:00 UTC, 1 replies.
- [jira] [Assigned] (NUTCH-2782) protocol-http / lib-http: support TLSv1.3 - posted by "Shashanka Balakuntala Srinivasa (Jira)" <ji...@apache.org> on 2020/07/07 15:17:00 UTC, 0 replies.
- [jira] [Created] (NUTCH-2799) Add .asf.yaml file - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2020/07/08 14:30:00 UTC, 0 replies.
- [GitHub] [nutch] sebastian-nagel opened a new pull request #536: [NUTCH-2799] Add .asf.yaml file - posted by GitBox <gi...@apache.org> on 2020/07/08 14:35:57 UTC, 0 replies.
- [jira] [Commented] (NUTCH-2799) Add .asf.yaml file - posted by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2020/07/08 14:36:00 UTC, 0 replies.
- [jira] [Commented] (NUTCH-2798) Nutch v2.4 Not Able to crawl after javax.faces.viewstate - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2020/07/08 16:21:00 UTC, 9 replies.
- [jira] [Created] (NUTCH-2800) Outdated information in documentation about catch all user agent - posted by "Jay (Jira)" <ji...@apache.org> on 2020/07/09 02:26:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2800) Outdated information in documentation about catch all user agent - posted by "Jay (Jira)" <ji...@apache.org> on 2020/07/09 03:03:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2798) Nutch v2.4 Not Able to crawl after javax.faces.viewstate - posted by "Mihir Sharma (Jira)" <ji...@apache.org> on 2020/07/09 14:14:00 UTC, 3 replies.
- [jira] [Issue Comment Deleted] (NUTCH-2798) Nutch v2.4 Not Able to crawl after javax.faces.viewstate - posted by "Mihir Sharma (Jira)" <ji...@apache.org> on 2020/07/09 15:31:00 UTC, 0 replies.
- [jira] [Created] (NUTCH-2801) RobotsRulesParser command-line checker to use http.robots.agents as fall-back - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2020/07/10 13:14:00 UTC, 0 replies.
- [GitHub] [nutch] sebastian-nagel opened a new pull request #537: [NUTCH-2801] RobotsRulesParser command-line checker to use http.robots.agents as fall-back - posted by GitBox <gi...@apache.org> on 2020/07/10 13:26:46 UTC, 0 replies.
- [jira] [Commented] (NUTCH-2801) RobotsRulesParser command-line checker to use http.robots.agents as fall-back - posted by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2020/07/10 13:27:00 UTC, 2 replies.
- [GitHub] [nutch] balashashanka commented on a change in pull request #537: [NUTCH-2801] RobotsRulesParser command-line checker to use http.robots.agents as fall-back - posted by GitBox <gi...@apache.org> on 2020/07/10 13:53:56 UTC, 0 replies.
- [GitHub] [nutch] sebastian-nagel commented on a change in pull request #537: [NUTCH-2801] RobotsRulesParser command-line checker to use http.robots.agents as fall-back - posted by GitBox <gi...@apache.org> on 2020/07/10 14:06:34 UTC, 0 replies.
- [jira] [Created] (NUTCH-2802) Replace blacklist/whitelist by more inclusive and precise terminology - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2020/07/10 17:34:00 UTC, 0 replies.
- [jira] [Assigned] (NUTCH-2802) Replace blacklist/whitelist by more inclusive and precise terminology - posted by "Lewis John McGibbney (Jira)" <ji...@apache.org> on 2020/07/10 17:37:00 UTC, 0 replies.
- [jira] [Commented] (NUTCH-2802) Replace blacklist/whitelist by more inclusive and precise terminology - posted by "Lewis John McGibbney (Jira)" <ji...@apache.org> on 2020/07/10 17:38:00 UTC, 1 replies.
- [GitHub] [nutch] balashashanka opened a new pull request #538: NUTCH-2782: protocol-http / lib-http: support TLSv1.3 - posted by GitBox <gi...@apache.org> on 2020/07/10 17:45:18 UTC, 0 replies.
- [jira] [Commented] (NUTCH-2782) protocol-http / lib-http: support TLSv1.3 - posted by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2020/07/10 17:46:00 UTC, 2 replies.
- [jira] [Created] (NUTCH-2803) Rename property http.robot.rules.whitelist - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2020/07/10 17:46:00 UTC, 0 replies.
- [jira] [Created] (NUTCH-2804) Rename blacklist/whitelist in configuration of subcollection plugin - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2020/07/10 17:58:00 UTC, 0 replies.
- [jira] [Created] (NUTCH-2805) Rename plugin urlfilter-domainblacklist - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2020/07/10 18:07:00 UTC, 0 replies.
- [jira] [Assigned] (NUTCH-2803) Rename property http.robot.rules.whitelist - posted by "Lewis John McGibbney (Jira)" <ji...@apache.org> on 2020/07/10 18:14:02 UTC, 0 replies.
- [GitHub] [nutch] lewismc opened a new pull request #539: NUTCH-2803 Rename property http.robot.rules.whitelist - posted by GitBox <gi...@apache.org> on 2020/07/10 18:38:01 UTC, 0 replies.
- [jira] [Commented] (NUTCH-2803) Rename property http.robot.rules.whitelist - posted by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2020/07/10 18:39:00 UTC, 1 replies.
- [jira] [Commented] (NUTCH-2805) Rename plugin urlfilter-domainblacklist - posted by "Shashanka Balakuntala Srinivasa (Jira)" <ji...@apache.org> on 2020/07/10 19:19:00 UTC, 6 replies.
- [jira] [Assigned] (NUTCH-2805) Rename plugin urlfilter-domainblacklist - posted by "Shashanka Balakuntala Srinivasa (Jira)" <ji...@apache.org> on 2020/07/10 19:19:00 UTC, 0 replies.
- [jira] [Created] (NUTCH-2806) Nutch can't parse links - posted by "lina dziri (Jira)" <ji...@apache.org> on 2020/07/10 22:10:00 UTC, 0 replies.
- [jira] [Commented] (NUTCH-2806) Nutch can't parse links - posted by "Jorge Luis Betancourt Gonzalez (Jira)" <ji...@apache.org> on 2020/07/10 22:58:00 UTC, 1 replies.
- [GitHub] [nutch] balashashanka opened a new pull request #540: NUTCH-2805: Rename plugin urlfilter-domainblacklist - posted by GitBox <gi...@apache.org> on 2020/07/11 07:38:38 UTC, 0 replies.
- [jira] [Created] (NUTCH-2807) SitemapProcessor to warn that ignoring robots.txt affects detection of sitemaps - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2020/07/11 12:05:00 UTC, 0 replies.
- [jira] [Created] (NUTCH-2808) Document side effects of ignoring robots.txt - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2020/07/11 12:06:00 UTC, 0 replies.
- [GitHub] [nutch] sebastian-nagel commented on a change in pull request #539: NUTCH-2803 Rename property http.robot.rules.whitelist - posted by GitBox <gi...@apache.org> on 2020/07/11 12:07:36 UTC, 0 replies.
- [GitHub] [nutch] sebastian-nagel merged pull request #535: [NUTCH-2796] [NUTCH-2730] Update crawler-commons 1.1, SitemapProcessor to treat sitemap URLs as Set instead of List - posted by GitBox <gi...@apache.org> on 2020/07/14 10:48:42 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-2796) Upgrade to crawler-commons 1.1 - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2020/07/14 10:50:00 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-2730) SitemapProcessor to treat sitemap URLs as Set instead of List - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2020/07/14 10:50:00 UTC, 0 replies.
- [GitHub] [nutch] sebastian-nagel merged pull request #538: NUTCH-2782: protocol-http / lib-http: support TLSv1.3 - posted by GitBox <gi...@apache.org> on 2020/07/14 10:52:05 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-2782) protocol-http / lib-http: support TLSv1.3 - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2020/07/14 10:53:00 UTC, 0 replies.
- [GitHub] [nutch] sebastian-nagel commented on a change in pull request #540: NUTCH-2805: Rename plugin urlfilter-domainblacklist - posted by GitBox <gi...@apache.org> on 2020/07/14 11:07:06 UTC, 0 replies.
- [jira] [Commented] (NUTCH-2730) SitemapProcessor to treat sitemap URLs as Set instead of List - posted by "Hudson (Jira)" <ji...@apache.org> on 2020/07/14 12:01:00 UTC, 0 replies.
- [GitHub] [nutch] balashashanka commented on a change in pull request #540: NUTCH-2805: Rename plugin urlfilter-domainblacklist - posted by GitBox <gi...@apache.org> on 2020/07/14 15:09:22 UTC, 0 replies.
- [GitHub] [nutch] balashashanka commented on pull request #439: NUTCH-1870 XSL parse filter - posted by GitBox <gi...@apache.org> on 2020/07/14 15:38:53 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1870) Generic xsl parser plugin - posted by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2020/07/14 15:39:00 UTC, 1 replies.
- [GitHub] [nutch] sebastian-nagel commented on pull request #439: NUTCH-1870 XSL parse filter - posted by GitBox <gi...@apache.org> on 2020/07/14 16:49:57 UTC, 0 replies.
- [jira] [Closed] (NUTCH-2798) Nutch v2.4 Not Able to crawl after javax.faces.viewstate - posted by "Mihir Sharma (Jira)" <ji...@apache.org> on 2020/07/15 08:40:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2290) Update licenses of bundled libraries - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2020/07/15 10:12:00 UTC, 2 replies.
- [jira] [Updated] (NUTCH-2809) Upgrade any23 plugin dependency - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2020/07/15 10:22:00 UTC, 1 replies.
- [jira] [Created] (NUTCH-2809) Upgrade any23 plugin dependency - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2020/07/15 10:22:00 UTC, 0 replies.
- [jira] [Assigned] (NUTCH-2809) Upgrade any23 plugin dependency - posted by "Shashanka Balakuntala Srinivasa (Jira)" <ji...@apache.org> on 2020/07/15 11:27:00 UTC, 0 replies.
- [jira] [Commented] (NUTCH-2809) Upgrade any23 plugin dependency - posted by "Shashanka Balakuntala Srinivasa (Jira)" <ji...@apache.org> on 2020/07/17 21:07:00 UTC, 3 replies.
- [GitHub] [nutch] balashashanka opened a new pull request #541: NUTCH-2809: Upgrade any23 plugin dependency - posted by GitBox <gi...@apache.org> on 2020/07/18 07:49:28 UTC, 0 replies.
- [GitHub] [nutch] lewismc commented on pull request #541: NUTCH-2809: Upgrade any23 plugin dependency - posted by GitBox <gi...@apache.org> on 2020/07/18 22:49:41 UTC, 1 replies.
- [GitHub] [nutch] dhkdn9192 commented on pull request #519: NUTCH-2785 FreeGenerator: command-line option to define number of generated fetch lists - posted by GitBox <gi...@apache.org> on 2020/07/27 05:28:08 UTC, 0 replies.
- [jira] [Commented] (NUTCH-2785) FreeGenerator: command-line option to define number of generated fetch lists - posted by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2020/07/27 05:29:00 UTC, 1 replies.
- [jira] [Updated] (NUTCH-2806) Nutch can't parse links - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2020/07/27 09:08:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2810) FreeGenerator to actually apply configured number of fetch lists - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2020/07/27 09:50:00 UTC, 0 replies.
- [jira] [Created] (NUTCH-2810) FreeGenerator to actually apply - posted by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2020/07/27 09:50:00 UTC, 0 replies.
- [GitHub] [nutch] sebastian-nagel commented on pull request #519: NUTCH-2785 FreeGenerator: command-line option to define number of generated fetch lists - posted by GitBox <gi...@apache.org> on 2020/07/27 09:55:14 UTC, 0 replies.
- [GitHub] [nutch] sebastian-nagel opened a new pull request #542: NUTCH-2810 FreeGenerator to actually apply configured number of fetch lists - posted by GitBox <gi...@apache.org> on 2020/07/27 10:06:42 UTC, 0 replies.
- [jira] [Commented] (NUTCH-2810) FreeGenerator to actually apply configured number of fetch lists - posted by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2020/07/27 10:07:00 UTC, 0 replies.
- [ANNOUNCE] New Nutch committer and PMC - Shashanka Balakuntala Srinivasa - posted by Sebastian Nagel <sn...@apache.org> on 2020/07/28 10:54:33 UTC, 0 replies.
- [jira] [Commented] (NUTCH-2669) Reliable solution for javax.ws packaging.type - posted by "Lewis John McGibbney (Jira)" <ji...@apache.org> on 2020/07/29 05:22:00 UTC, 0 replies.
- [jira] [Comment Edited] (NUTCH-2669) Reliable solution for javax.ws packaging.type - posted by "Lewis John McGibbney (Jira)" <ji...@apache.org> on 2020/07/29 05:22:00 UTC, 2 replies.
- [GitHub] [nutch] balashashanka merged pull request #540: NUTCH-2805: Rename plugin urlfilter-domainblacklist - posted by GitBox <gi...@apache.org> on 2020/07/29 17:47:19 UTC, 0 replies.
- Re: Setting up automatic tests and check in GIT - posted by lewis john mcgibbney <le...@apache.org> on 2020/07/30 16:54:30 UTC, 1 replies.
- [jira] [Created] (NUTCH-2811) Setup Github workflows for PR - posted by "Shashanka Balakuntala Srinivasa (Jira)" <ji...@apache.org> on 2020/07/31 05:22:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-2811) Setup Github workflows for PR - posted by "Shashanka Balakuntala Srinivasa (Jira)" <ji...@apache.org> on 2020/07/31 06:16:00 UTC, 0 replies.
- [GitHub] [nutch] madhawa-gunasekara opened a new pull request #543: NUTCH-2811 : Setup Github workflows for prs - posted by GitBox <gi...@apache.org> on 2020/07/31 13:57:04 UTC, 0 replies.
- [jira] [Commented] (NUTCH-2811) Setup Github workflows for PR - posted by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2020/07/31 13:58:00 UTC, 10 replies.
- [GitHub] [nutch] balashashanka commented on a change in pull request #543: NUTCH-2811 : Setup Github workflows for prs - posted by GitBox <gi...@apache.org> on 2020/07/31 15:47:10 UTC, 2 replies.
- [GitHub] [nutch] madhawa-gunasekara commented on a change in pull request #543: NUTCH-2811 : Setup Github workflows for prs - posted by GitBox <gi...@apache.org> on 2020/07/31 16:10:21 UTC, 3 replies.
- [GitHub] [nutch] jorgelbg commented on a change in pull request #543: NUTCH-2811 : Setup Github workflows for prs - posted by GitBox <gi...@apache.org> on 2020/07/31 18:01:42 UTC, 0 replies.
- [GitHub] [nutch] lewismc commented on pull request #543: NUTCH-2811 : Setup Github workflows for prs - posted by GitBox <gi...@apache.org> on 2020/07/31 21:10:53 UTC, 0 replies.