- jobid not fit the date - posted by lujinhong <lu...@yahoo.com> on 2015/03/01 06:43:08 UTC, 0 replies.
- FW: Curating Issues - posted by "Mattmann, Chris A (3980)" <ch...@jpl.nasa.gov> on 2015/03/01 21:20:48 UTC, 0 replies.
- Re: Review Request 31579: Patch fo NUTCH-1949: Dump out the Nuth data into the Common Crawl format - posted by Giuseppe Totaro <to...@gmail.com> on 2015/03/02 03:45:52 UTC, 5 replies.
- Nutch does not parse rss feed file - posted by Jamal Nasir <mj...@gmail.com> on 2015/03/02 16:37:57 UTC, 0 replies.
- [Nutch Wiki] Trivial Update of "NutchTutorial" by LewisJohnMcgibbney - posted by Apache Wiki <wi...@apache.org> on 2015/03/02 18:53:05 UTC, 1 replies.
- [jira] [Commented] (NUTCH-1949) Dump out the Nuth data into the Common Crawl format - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2015/03/02 21:02:04 UTC, 4 replies.
- unsubscribe - posted by ch...@knowledge-stream.com on 2015/03/02 22:35:36 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1936) GSoC 2015 - Move Nutch to Hadoop 2.X - posted by "Petr Shypila (JIRA)" <ji...@apache.org> on 2015/03/03 06:26:05 UTC, 8 replies.
- [jira] [Resolved] (NUTCH-1921) Optionally disable HTTP if-modified-since header - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2015/03/03 14:18:04 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1946) Upgrade to Gora 0.6.1 - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2015/03/03 23:18:05 UTC, 1 replies.
- Re: Review Request 31579: Patch for NUTCH-1949: Dump out the Nuth data into the Common Crawl format - posted by Giuseppe Totaro <to...@gmail.com> on 2015/03/04 01:19:56 UTC, 2 replies.
- [GitHub] nutch pull request: fix for NUTCH-1950 contributed by xzjh - posted by asfgit <gi...@git.apache.org> on 2015/03/04 03:20:50 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1950) File name too long when bin/nutch dump - posted by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2015/03/04 03:21:04 UTC, 1 replies.
- [jira] [Resolved] (NUTCH-1950) File name too long when bin/nutch dump - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2015/03/04 03:22:04 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1949) Dump out the Nuth data into the Common Crawl format - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2015/03/04 05:13:05 UTC, 2 replies.
- [jira] [Commented] (NUTCH-1946) Upgrade to Gora 0.6.1 - posted by "Jeroen Vlek (JIRA)" <ji...@apache.org> on 2015/03/04 13:07:04 UTC, 10 replies.
- [jira] [Resolved] (NUTCH-1949) Dump out the Nuth data into the Common Crawl format - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2015/03/04 19:49:39 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1949) Dump out the Nutch data into the Common Crawl format - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2015/03/04 20:08:38 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1949) Dump out the Nutch data into the Common Crawl format - posted by "Hudson (JIRA)" <ji...@apache.org> on 2015/03/04 22:31:39 UTC, 0 replies.
- [jira] [Comment Edited] (NUTCH-1946) Upgrade to Gora 0.6.1 - posted by "Jeroen Vlek (JIRA)" <ji...@apache.org> on 2015/03/05 12:46:40 UTC, 2 replies.
- Fwd: Google Summer of Code 2015 Mentor Registration - posted by Lewis John Mcgibbney <le...@gmail.com> on 2015/03/06 21:01:36 UTC, 3 replies.
- [jira] [Created] (NUTCH-1954) FilenameTooLong error appears in CommonCrawlDumper - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2015/03/07 05:31:38 UTC, 0 replies.
- [jira] [Work started] (NUTCH-1954) FilenameTooLong error appears in CommonCrawlDumper - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2015/03/07 05:32:38 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1954) FilenameTooLong error appears in CommonCrawlDumper - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2015/03/07 05:32:38 UTC, 5 replies.
- [GitHub] nutch pull request: Fix for NUTCH-1954: FilenameTooLong error appe... - posted by chrismattmann <gi...@git.apache.org> on 2015/03/07 05:47:58 UTC, 3 replies.
- [jira] [Resolved] (NUTCH-1954) FilenameTooLong error appears in CommonCrawlDumper - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2015/03/07 05:56:38 UTC, 0 replies.
- [Nutch Wiki] Update of "ContributorsGroup" by ChrisMattmann - posted by Apache Wiki <wi...@apache.org> on 2015/03/08 21:38:55 UTC, 0 replies.
- title inside body problem - posted by Zein Shaheen <ze...@gmail.com> on 2015/03/09 00:12:07 UTC, 2 replies.
- [jira] [Commented] (NUTCH-1948) Make the Selenium remote web driver specification, configuration and selection available via a Factory-type mechanism - posted by "Mo Omer (JIRA)" <ji...@apache.org> on 2015/03/09 17:50:38 UTC, 1 replies.
- [Nutch Wiki] Update of "GiuseppeTotaro" by GiuseppeTotaro - posted by Apache Wiki <wi...@apache.org> on 2015/03/09 20:39:17 UTC, 0 replies.
- [Nutch Wiki] Update of "CommonCrawlDataDumper" by GiuseppeTotaro - posted by Apache Wiki <wi...@apache.org> on 2015/03/10 00:53:27 UTC, 1 replies.
- Handling servers with wrong Last Modified HTTP header - posted by Jorge Luis Betancourt González <jl...@uci.cu> on 2015/03/10 04:23:03 UTC, 1 replies.
- Filter rejecting url - posted by Siddhartha Sandhu <si...@icloud.com> on 2015/03/10 10:40:25 UTC, 1 replies.
- [jira] [Created] (NUTCH-1955) ByteWritable missing in NutchWritable - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2015/03/10 11:54:38 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1955) ByteWritable missing in NutchWritable - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2015/03/10 11:54:39 UTC, 0 replies.
- [jira] [Created] (NUTCH-1956) Members to be public in URLCrawlDatum - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2015/03/10 15:46:38 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1956) Members to be public in URLCrawlDatum - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2015/03/10 15:46:39 UTC, 0 replies.
- [Nutch Wiki] New attachment added to page GiuseppeTotaro - posted by Apache Wiki <wi...@apache.org> on 2015/03/10 22:10:31 UTC, 1 replies.
- [Nutch Wiki] New attachment added to page CommonCrawlDataDumper - posted by Apache Wiki <wi...@apache.org> on 2015/03/10 23:13:44 UTC, 2 replies.
- [Nutch Wiki] Trivial Update of "ContributorsGroup" by LewisJohnMcgibbney - posted by Apache Wiki <wi...@apache.org> on 2015/03/11 00:28:46 UTC, 3 replies.
- [Nutch Wiki] Trivial Update of "GoogleSummerOfCode" by LewisJohnMcgibbney - posted by Apache Wiki <wi...@apache.org> on 2015/03/11 00:37:14 UTC, 0 replies.
- [Nutch Wiki] Trivial Update of "CommandLineOptions" by LewisJohnMcgibbney - posted by Apache Wiki <wi...@apache.org> on 2015/03/11 01:16:56 UTC, 0 replies.
- Re: GSOC 2015, Introduction and Project of Interest - posted by Lewis John Mcgibbney <le...@gmail.com> on 2015/03/11 01:43:15 UTC, 2 replies.
- [jira] [Issue Comment Deleted] (NUTCH-1936) GSoC 2015 - Move Nutch to Hadoop 2.X - posted by "Ashwini Tokekar (JIRA)" <ji...@apache.org> on 2015/03/11 05:54:38 UTC, 1 replies.
- [jira] [Created] (NUTCH-1957) FileDumper output file name collisions - posted by "Renxia Wang (JIRA)" <ji...@apache.org> on 2015/03/11 08:04:38 UTC, 0 replies.
- [jira] [Created] (NUTCH-1958) Remove scoring-opic from nutch-default.xml - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2015/03/11 10:59:38 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1932) Automatically remove orphaned pages - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2015/03/11 11:52:38 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1957) FileDumper output file name collisions - posted by "Giuseppe Totaro (JIRA)" <ji...@apache.org> on 2015/03/11 17:26:38 UTC, 6 replies.
- [jira] [Created] (NUTCH-1959) Improving CommonCrawlFormat implementations - posted by "Giuseppe Totaro (JIRA)" <ji...@apache.org> on 2015/03/11 18:04:40 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1959) Improving CommonCrawlFormat implementations - posted by "Giuseppe Totaro (JIRA)" <ji...@apache.org> on 2015/03/11 18:05:39 UTC, 1 replies.
- [jira] [Updated] (NUTCH-1960) JUnit test for dump method of CommonCrawlDataDumper - posted by "Giuseppe Totaro (JIRA)" <ji...@apache.org> on 2015/03/11 18:11:38 UTC, 3 replies.
- [jira] [Created] (NUTCH-1960) JUnit test for dump method of CommonCrawlDataDumper - posted by "Giuseppe Totaro (JIRA)" <ji...@apache.org> on 2015/03/11 18:11:38 UTC, 0 replies.
- [jira] [Created] (NUTCH-1961) Provide multipart compression of Common Crawl data - posted by "Giuseppe Totaro (JIRA)" <ji...@apache.org> on 2015/03/11 18:20:38 UTC, 0 replies.
- [jira] [Created] (NUTCH-1962) Need to have mimetype-filter.txt file available by default - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2015/03/11 23:48:38 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1962) Need to have mimetype-filter.txt file available by default - posted by "Jorge Luis Betancourt Gonzalez (JIRA)" <ji...@apache.org> on 2015/03/12 01:42:38 UTC, 4 replies.
- [jira] [Work started] (NUTCH-1959) Improving CommonCrawlFormat implementations - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2015/03/12 07:10:39 UTC, 0 replies.
- [jira] [Assigned] (NUTCH-1959) Improving CommonCrawlFormat implementations - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2015/03/12 07:10:39 UTC, 0 replies.
- [jira] [Work started] (NUTCH-1960) JUnit test for dump method of CommonCrawlDataDumper - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2015/03/12 07:10:46 UTC, 0 replies.
- [jira] [Assigned] (NUTCH-1960) JUnit test for dump method of CommonCrawlDataDumper - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2015/03/12 07:10:46 UTC, 0 replies.
- HTTP Post Authentication - posted by Tizy Ninan <ti...@gmail.com> on 2015/03/12 07:59:08 UTC, 4 replies.
- [jira] [Commented] (NUTCH-1956) Members to be public in URLCrawlDatum - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2015/03/12 10:11:38 UTC, 1 replies.
- [GitHub] nutch pull request: NUTCH-1957 using MD5 as part of file path to s... - posted by renxiawang <gi...@git.apache.org> on 2015/03/12 11:04:57 UTC, 1 replies.
- [jira] [Created] (NUTCH-1963) CommonsCrawlDataDumper is too long ( > 100 bytes) when -gzip option invoked - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2015/03/12 19:14:38 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1963) CommonsCrawlDataDumper is too long ( > 100 bytes) when -gzip option invoked - posted by "Giuseppe Totaro (JIRA)" <ji...@apache.org> on 2015/03/12 19:28:38 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1957) FileDumper output file name collisions - posted by "Renxia Wang (JIRA)" <ji...@apache.org> on 2015/03/13 00:31:38 UTC, 1 replies.
- [Nutch Wiki] Update of "Nutch_1.X_RESTAPI" by SujenShah - posted by Apache Wiki <wi...@apache.org> on 2015/03/13 02:42:44 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1962) Need to have mimetype-filter.txt file available by default - posted by "Jorge Luis Betancourt Gonzalez (JIRA)" <ji...@apache.org> on 2015/03/13 04:39:38 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1956) Members to be public in URLCrawlDatum - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2015/03/13 15:57:38 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1955) ByteWritable missing in NutchWritable - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2015/03/13 15:58:38 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1955) ByteWritable missing in NutchWritable - posted by "Hudson (JIRA)" <ji...@apache.org> on 2015/03/13 16:51:38 UTC, 0 replies.
- [jira] [Created] (NUTCH-1964) tmp directory not cleaned up after using commoncrawldump tool - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2015/03/13 18:58:39 UTC, 0 replies.
- Fwd: GSoC 2015 Proposal for Nutch -1936 ( Move Nutch to Hadoop 2.X) - posted by Suman Saurabh <ss...@gmail.com> on 2015/03/14 21:06:29 UTC, 0 replies.
- [jira] [Assigned] (NUTCH-1957) FileDumper output file name collisions - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2015/03/15 05:21:39 UTC, 0 replies.
- [jira] [Work started] (NUTCH-1957) FileDumper output file name collisions - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2015/03/15 05:21:39 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1957) FileDumper output file name collisions - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2015/03/15 05:25:39 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1741) Support of Sitemaps in Nutch 2.x - posted by "Talat UYARER (JIRA)" <ji...@apache.org> on 2015/03/16 10:44:38 UTC, 1 replies.
- [jira] [Created] (NUTCH-1965) My - posted by "LEARNING TESTING (JIRA)" <ji...@apache.org> on 2015/03/16 11:45:38 UTC, 0 replies.
- [jira] [Closed] (NUTCH-1965) My - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2015/03/16 11:58:38 UTC, 0 replies.
- Fwd: Hbase 0.94.24 hadoop 2.5.0 Gora 0.4 and Nutch 2.3 failing at inject - posted by Siddhartha Sandhu <si...@icloud.com> on 2015/03/16 16:57:30 UTC, 0 replies.
- Re: Project Proposal - posted by Lewis John Mcgibbney <le...@gmail.com> on 2015/03/16 17:29:14 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1941) Optional rolling http.agent.name's - posted by "Asitang Mishra (JIRA)" <ji...@apache.org> on 2015/03/16 23:45:39 UTC, 10 replies.
- [jira] [Comment Edited] (NUTCH-1941) Optional rolling http.agent.name's - posted by "Asitang Mishra (JIRA)" <ji...@apache.org> on 2015/03/16 23:46:38 UTC, 6 replies.
- [jira] [Commented] (NUTCH-1941) Optional rolling http.agent.name's - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2015/03/17 00:19:38 UTC, 21 replies.
- [jira] [Created] (NUTCH-1966) Configuration endpoint for 1x REST API [A sub-issue of NUTCH-1931] - posted by "Sujen Shah (JIRA)" <ji...@apache.org> on 2015/03/17 08:31:38 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1966) Configuration endpoint for 1x REST API [A sub-issue of NUTCH-1931] - posted by "Sujen Shah (JIRA)" <ji...@apache.org> on 2015/03/17 08:33:38 UTC, 0 replies.
- [GitHub] nutch pull request: fix for Nutch-1966 contributed by sujen1412 - posted by sujen1412 <gi...@git.apache.org> on 2015/03/17 09:54:50 UTC, 1 replies.
- [jira] [Commented] (NUTCH-1931) Apache Nutch 1.x REST service and crawler visualization - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2015/03/17 10:07:38 UTC, 2 replies.
- [jira] [Created] (NUTCH-1967) Possible SIooBE in MimeAdaptiveFetchSchedule - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2015/03/17 16:36:39 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1967) Possible SIooBE in MimeAdaptiveFetchSchedule - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2015/03/17 16:37:39 UTC, 4 replies.
- [jira] [Commented] (NUTCH-1966) Configuration endpoint for 1x REST API [A sub-issue of NUTCH-1931] - posted by "Sujen Shah (JIRA)" <ji...@apache.org> on 2015/03/17 17:58:39 UTC, 5 replies.
- [jira] [Commented] (NUTCH-1967) Possible SIooBE in MimeAdaptiveFetchSchedule - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2015/03/17 21:43:39 UTC, 1 replies.
- [jira] [Created] (NUTCH-1968) File Name too long issue of DumpFileUtil.java file - posted by "Xin Zhang (JIRA)" <ji...@apache.org> on 2015/03/18 08:52:38 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1967) Possible SIooBE in MimeAdaptiveFetchSchedule - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2015/03/18 09:05:38 UTC, 0 replies.
- [jira] [Created] (NUTCH-1969) URL Normalizer properly handling slashes - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2015/03/18 09:13:38 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1969) URL Normalizer properly handling slashes - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2015/03/18 09:22:38 UTC, 1 replies.
- [jira] [Commented] (NUTCH-1968) File Name too long issue of DumpFileUtil.java file - posted by "Renxia Wang (JIRA)" <ji...@apache.org> on 2015/03/18 18:02:38 UTC, 8 replies.
- [jira] [Updated] (NUTCH-1968) File Name too long issue of DumpFileUtil.java file - posted by "Renxia Wang (JIRA)" <ji...@apache.org> on 2015/03/18 18:02:38 UTC, 0 replies.
- [GitHub] nutch pull request: NUTCH-1968 resolved file extension too long is... - posted by renxiawang <gi...@git.apache.org> on 2015/03/18 18:15:35 UTC, 1 replies.
- [Nutch Wiki] Trivial Update of "Nutch_1.X_RESTAPI" by SujenShah - posted by Apache Wiki <wi...@apache.org> on 2015/03/19 01:22:31 UTC, 0 replies.
- [jira] [Assigned] (NUTCH-1966) Configuration endpoint for 1x REST API [A sub-issue of NUTCH-1931] - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2015/03/19 04:07:39 UTC, 0 replies.
- [jira] [Work started] (NUTCH-1966) Configuration endpoint for 1x REST API [A sub-issue of NUTCH-1931] - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2015/03/19 04:07:39 UTC, 0 replies.
- [jira] [Work started] (NUTCH-1968) File Name too long issue of DumpFileUtil.java file - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2015/03/19 04:10:38 UTC, 0 replies.
- [jira] [Assigned] (NUTCH-1968) File Name too long issue of DumpFileUtil.java file - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2015/03/19 04:10:38 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1966) Configuration endpoint for 1x REST API [A sub-issue of NUTCH-1931] - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2015/03/19 04:51:38 UTC, 0 replies.
- [jira] [Work started] (NUTCH-1970) Pretty print JSON output in config resource - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2015/03/19 04:52:38 UTC, 0 replies.
- [jira] [Created] (NUTCH-1970) Pretty print JSON output in config resource - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2015/03/19 04:52:38 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1968) File Name too long issue of DumpFileUtil.java file - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2015/03/19 05:23:38 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1741) Support of Sitemaps in Nutch 2.x - posted by "cihad güzel (JIRA)" <ji...@apache.org> on 2015/03/19 21:52:38 UTC, 0 replies.
- [jira] [Created] (NUTCH-1971) The crawldb.url.filters property is not present in any configuration file - posted by "Luis Lopez (JIRA)" <ji...@apache.org> on 2015/03/19 23:51:38 UTC, 0 replies.
- [jira] [Created] (NUTCH-1972) Dockerfile for Nutch 1.x - posted by "Michael Joyce (JIRA)" <ji...@apache.org> on 2015/03/20 02:44:38 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1972) Dockerfile for Nutch 1.x - posted by "Michael Joyce (JIRA)" <ji...@apache.org> on 2015/03/20 02:45:38 UTC, 1 replies.
- [jira] [Commented] (NUTCH-1972) Dockerfile for Nutch 1.x - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2015/03/20 03:01:38 UTC, 0 replies.
- [jira] [Assigned] (NUTCH-1972) Dockerfile for Nutch 1.x - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2015/03/20 03:01:39 UTC, 0 replies.
- [jira] [Work started] (NUTCH-1972) Dockerfile for Nutch 1.x - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2015/03/20 03:01:39 UTC, 0 replies.
- [jira] [Created] (NUTCH-1973) Job Administration end point for the REST service - posted by "Sujen Shah (JIRA)" <ji...@apache.org> on 2015/03/20 09:15:38 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1971) The crawldb.url.filters property is not present in any configuration file - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2015/03/20 10:05:38 UTC, 1 replies.
- multiple parses from one page - posted by Mahmoud Gzawi <gz...@gmail.com> on 2015/03/20 14:52:47 UTC, 3 replies.
- [jira] [Comment Edited] (NUTCH-1971) The crawldb.url.filters property is not present in any configuration file - posted by "Luis Lopez (JIRA)" <ji...@apache.org> on 2015/03/20 18:47:38 UTC, 0 replies.
- Problem with redirection - posted by Mahmoud Gzawi <gz...@gmail.com> on 2015/03/20 22:56:07 UTC, 2 replies.
- [jira] [Resolved] (NUTCH-1962) Need to have mimetype-filter.txt file available by default - posted by "Jorge Luis Betancourt Gonzalez (JIRA)" <ji...@apache.org> on 2015/03/21 07:57:38 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1958) Remove scoring-opic from nutch-default.xml - posted by "Jorge Luis Betancourt Gonzalez (JIRA)" <ji...@apache.org> on 2015/03/21 08:00:47 UTC, 4 replies.
- TestGDALParser.testParseBasicInfo and TestGDALParser.testParseMetadata errors - posted by Anvesha Sinha <an...@usc.edu> on 2015/03/22 06:24:09 UTC, 3 replies.
- [ANNOUNCE] New Nutch committer and PMC - Mo Omer - posted by Sebastian Nagel <wa...@googlemail.com> on 2015/03/22 10:40:36 UTC, 4 replies.
- [jira] [Created] (NUTCH-1974) keyPrefix option for CommonCrawlDataDumper tool - posted by "Giuseppe Totaro (JIRA)" <ji...@apache.org> on 2015/03/23 20:17:54 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1974) keyPrefix option for CommonCrawlDataDumper tool - posted by "Giuseppe Totaro (JIRA)" <ji...@apache.org> on 2015/03/23 20:18:53 UTC, 3 replies.
- Crawl images and store locally - posted by Tizy Ninan <ti...@gmail.com> on 2015/03/24 07:12:16 UTC, 2 replies.
- Issue related to fetcher.parse property - posted by Asitang Mishra <as...@usc.edu> on 2015/03/24 09:07:57 UTC, 0 replies.
- [jira] [Work started] (NUTCH-1974) keyPrefix option for CommonCrawlDataDumper tool - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2015/03/24 14:57:52 UTC, 0 replies.
- [jira] [Assigned] (NUTCH-1974) keyPrefix option for CommonCrawlDataDumper tool - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2015/03/24 14:57:52 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1974) keyPrefix option for CommonCrawlDataDumper tool - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2015/03/24 15:03:53 UTC, 5 replies.
- GSoC 2015 Proposal for Nutch-1936 - posted by Suman Saurabh <ss...@gmail.com> on 2015/03/25 19:06:02 UTC, 0 replies.
- Re: Review Request 32451: keyPrefix option for CommonCrawlDataDumper tool - posted by Giuseppe Totaro <to...@gmail.com> on 2015/03/26 03:47:06 UTC, 1 replies.
- [jira] [Resolved] (NUTCH-1974) keyPrefix option for CommonCrawlDataDumper tool - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2015/03/26 04:01:52 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1959) Improving CommonCrawlFormat implementations - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2015/03/26 05:33:54 UTC, 2 replies.
- [DEADLINE] Google Summer of Code Deadline Approaching Soon - posted by Lewis John Mcgibbney <le...@gmail.com> on 2015/03/26 05:35:31 UTC, 4 replies.
- [jira] [Resolved] (NUTCH-1959) Improving CommonCrawlFormat implementations - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2015/03/26 05:36:53 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1325) HostDB for Nutch - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2015/03/26 16:09:53 UTC, 3 replies.
- [jira] [Commented] (NUTCH-1325) HostDB for Nutch - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2015/03/26 20:24:53 UTC, 3 replies.
- [jira] [Comment Edited] (NUTCH-1741) Support of Sitemaps in Nutch 2.x - posted by "cihad güzel (JIRA)" <ji...@apache.org> on 2015/03/27 09:44:53 UTC, 0 replies.
- GSOC RDF Microformats Support - posted by Remzi Düzağaç <re...@gmail.com> on 2015/03/27 13:07:51 UTC, 3 replies.
- GSOC Sentence Detection and Named Entity Recognize - posted by ilhami kalkan <il...@gmail.com> on 2015/03/27 17:18:19 UTC, 0 replies.
- HTTP POST Authentication - posted by Tyler Palsulich <tp...@gmail.com> on 2015/03/27 18:31:48 UTC, 0 replies.
- [jira] [Assigned] (NUTCH-1941) Optional rolling http.agent.name's - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2015/03/27 21:57:53 UTC, 0 replies.
- [jira] [Work started] (NUTCH-1941) Optional rolling http.agent.name's - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2015/03/27 21:58:52 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1941) Optional rolling http.agent.name's - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2015/03/27 22:47:54 UTC, 0 replies.
- [jira] [Created] (NUTCH-1975) New configuration for CommonCrawlDataDumper tool - posted by "Giuseppe Totaro (JIRA)" <ji...@apache.org> on 2015/03/28 01:48:52 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1975) New configuration for CommonCrawlDataDumper tool - posted by "Giuseppe Totaro (JIRA)" <ji...@apache.org> on 2015/03/28 01:49:52 UTC, 0 replies.
- [jira] [Assigned] (NUTCH-1975) New configuration for CommonCrawlDataDumper tool - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2015/03/28 03:57:52 UTC, 0 replies.
- [jira] [Work started] (NUTCH-1975) New configuration for CommonCrawlDataDumper tool - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2015/03/28 03:57:53 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1975) New configuration for CommonCrawlDataDumper tool - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2015/03/28 03:58:52 UTC, 1 replies.
- [jira] [Comment Edited] (NUTCH-1325) HostDB for Nutch - posted by "Jorge Luis Betancourt Gonzalez (JIRA)" <ji...@apache.org> on 2015/03/28 06:01:53 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1970) Pretty print JSON output in config resource - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2015/03/28 17:29:52 UTC, 2 replies.
- [jira] [Updated] (NUTCH-1970) Pretty print JSON output in config resource - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2015/03/28 17:29:53 UTC, 1 replies.
- [jira] [Created] (NUTCH-1976) Allow Users to Set Hostname for Server - posted by "Tyler Palsulich (JIRA)" <ji...@apache.org> on 2015/03/29 04:18:52 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1976) Allow Users to Set Hostname for Server - posted by "Tyler Palsulich (JIRA)" <ji...@apache.org> on 2015/03/29 04:39:52 UTC, 1 replies.
- [jira] [Work started] (NUTCH-1976) Allow Users to Set Hostname for Server - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2015/03/29 07:15:52 UTC, 0 replies.
- [jira] [Assigned] (NUTCH-1976) Allow Users to Set Hostname for Server - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2015/03/29 07:15:52 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1976) Allow Users to Set Hostname for Server - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2015/03/29 07:15:53 UTC, 1 replies.
- [jira] [Resolved] (NUTCH-1976) Allow Users to Set Hostname for Server - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2015/03/29 07:17:54 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1970) Pretty print JSON output in config resource - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2015/03/29 07:29:53 UTC, 0 replies.
- [jira] [Created] (NUTCH-1977) commoncrawldump java heap space - posted by "Jiaheng Zhang (JIRA)" <ji...@apache.org> on 2015/03/29 08:30:52 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1977) commoncrawldump java heap space - posted by "Xin Zhang (JIRA)" <ji...@apache.org> on 2015/03/29 08:49:52 UTC, 0 replies.
- [Nutch Wiki] Update of "SumanSaurabh/GSoC2015Nutch" by SumanSaurabh - posted by Apache Wiki <wi...@apache.org> on 2015/03/29 15:50:10 UTC, 1 replies.
- [Nutch Wiki] Trivial Update of "MohitBagde" by MohitBagde - posted by Apache Wiki <wi...@apache.org> on 2015/03/29 20:07:07 UTC, 0 replies.
- [Nutch Wiki] Update of "ashwinitokekar" by ashwinitokekar - posted by Apache Wiki <wi...@apache.org> on 2015/03/29 21:32:34 UTC, 1 replies.
- [Nutch Wiki] Update of "SumanSaurabh" by SumanSaurabh - posted by Apache Wiki <wi...@apache.org> on 2015/03/30 11:55:36 UTC, 0 replies.
- [Nutch Wiki] Update of "NutchHBaseHiveMapping" by talat - posted by Apache Wiki <wi...@apache.org> on 2015/03/30 13:23:12 UTC, 0 replies.
- [Nutch Wiki] Update of "FrontPage" by talat - posted by Apache Wiki <wi...@apache.org> on 2015/03/30 13:37:26 UTC, 0 replies.
- [jira] [Created] (NUTCH-1978) solrindex will fail when indexing corrupted segments - posted by "Chong Li (JIRA)" <ji...@apache.org> on 2015/03/31 08:11:52 UTC, 0 replies.
- [jira] [Created] (NUTCH-1979) CrawlDbReader to implement Tool - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2015/03/31 11:11:53 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1979) CrawlDbReader to implement Tool - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2015/03/31 11:12:52 UTC, 3 replies.
- [jira] [Commented] (NUTCH-1979) CrawlDbReader to implement Tool - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2015/03/31 17:33:53 UTC, 3 replies.
- [jira] [Resolved] (NUTCH-1979) CrawlDbReader to implement Tool - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2015/03/31 19:35:54 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #3040 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2015/03/31 19:44:31 UTC, 0 replies.
- Jenkins build is back to normal : Nutch-trunk #3041 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2015/03/31 21:51:24 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1771) Solrindex fails if a segment is corrupted or incomplete - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2015/03/31 22:34:53 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1978) solrindex will fail when indexing corrupted segments - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2015/03/31 22:36:54 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1771) Solrindex fails if a segment is corrupted or incomplete - posted by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2015/03/31 22:36:55 UTC, 0 replies.
- Re: [DISCUSS] Release Apache Nutch 1.10 - posted by Sebastian Nagel <wa...@googlemail.com> on 2015/03/31 23:13:53 UTC, 0 replies.