- Build failed in Jenkins: Nutch-trunk #1504 - posted by Apache Jenkins Server <hu...@hudson.apache.org> on 2011/06/01 06:12:10 UTC, 0 replies.
- Re: Questions about Jena in Nutch - posted by lfs <fa...@hotmail.com> on 2011/06/01 09:58:52 UTC, 1 replies.
- [jira] [Issue Comment Edited] (NUTCH-995) Generate POM file using the Ivy makepom task - posted by "Gabriele Kahlout (JIRA)" <ji...@apache.org> on 2011/06/01 11:41:47 UTC, 5 replies.
- [jira] [Commented] (NUTCH-995) Generate POM file using the Ivy makepom task - posted by "Gabriele Kahlout (JIRA)" <ji...@apache.org> on 2011/06/01 11:41:47 UTC, 22 replies.
- [jira] [Updated] (NUTCH-995) Generate POM file using the Ivy makepom task - posted by "Gabriele Kahlout (JIRA)" <ji...@apache.org> on 2011/06/01 11:43:47 UTC, 0 replies.
- [jira] [Created] (NUTCH-1001) bin/nutch fetch/parse handle crawl/segments directory - posted by "Gabriele Kahlout (JIRA)" <ji...@apache.org> on 2011/06/01 22:04:47 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1001) bin/nutch fetch/parse handle crawl/segments directory - posted by "Gabriele Kahlout (JIRA)" <ji...@apache.org> on 2011/06/01 22:10:51 UTC, 2 replies.
- [jira] [Issue Comment Edited] (NUTCH-1001) bin/nutch fetch/parse handle crawl/segments directory - posted by "Gabriele Kahlout (JIRA)" <ji...@apache.org> on 2011/06/01 22:12:47 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #1505 - posted by Apache Jenkins Server <hu...@hudson.apache.org> on 2011/06/02 06:03:55 UTC, 0 replies.
- [jira] [Updated] (NUTCH-961) Expose Tika's boilerpipe support - posted by "Gabriele Kahlout (JIRA)" <ji...@apache.org> on 2011/06/02 09:19:47 UTC, 6 replies.
- [jira] [Created] (NUTCH-1002) Want to be able to filter url's through code, rather than through configuration file - crawl-urlfilter.txt - posted by "shantanu sardal (JIRA)" <ji...@apache.org> on 2011/06/02 23:37:48 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #1506 - posted by Apache Jenkins Server <hu...@hudson.apache.org> on 2011/06/03 06:03:23 UTC, 0 replies.
- [VOTE] Apache Nutch 1.3 Release Candidate #2 - posted by "Mattmann, Chris A (388J)" <ch...@jpl.nasa.gov> on 2011/06/04 06:02:00 UTC, 4 replies.
- Build failed in Jenkins: Nutch-trunk #1507 - posted by Apache Jenkins Server <hu...@hudson.apache.org> on 2011/06/04 06:12:10 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1002) Want to be able to filter url's through code, rather than through configuration file - crawl-urlfilter.txt - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2011/06/04 10:14:47 UTC, 0 replies.
- [jira] [Created] (NUTCH-1003) 'package' task does not reflect the new organisation of the code - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2011/06/04 10:18:47 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1003) 'package' task does not reflect the new organisation of the code - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2011/06/04 10:20:47 UTC, 1 replies.
- [jira] [Resolved] (NUTCH-1003) 'package' task does not reflect the new organisation of the code - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2011/06/04 10:38:47 UTC, 0 replies.
- [jira] [Work started] (NUTCH-995) Generate POM file using the Ivy makepom task - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2011/06/04 19:22:47 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-995) Generate POM file using the Ivy makepom task - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2011/06/04 20:44:47 UTC, 0 replies.
- [VOTE] Apache Nutch 1.3 Release Candidate #3 - posted by "Mattmann, Chris A (388J)" <ch...@jpl.nasa.gov> on 2011/06/04 21:03:02 UTC, 3 replies.
- Build failed in Jenkins: Nutch-trunk #1508 - posted by Apache Jenkins Server <hu...@hudson.apache.org> on 2011/06/05 06:12:14 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #1509 - posted by Apache Jenkins Server <hu...@hudson.apache.org> on 2011/06/06 06:01:19 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #1510 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2011/06/07 06:03:16 UTC, 0 replies.
- [jira] [Created] (NUTCH-1004) Do not index empty values for title field - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/07 15:27:58 UTC, 0 replies.
- [jira] [Created] (NUTCH-1005) Index headings h1 and h2 - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/07 15:39:58 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1005) Index headings h1 and h2 - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/07 15:41:58 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1004) Do not index empty values for title field - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/07 15:45:58 UTC, 4 replies.
- dated issues in JIRA - posted by lewis john mcgibbney <le...@gmail.com> on 2011/06/07 17:01:12 UTC, 3 replies.
- [jira] [Commented] (NUTCH-623) Change plugin source directory "languageidentifier" to "language-identifier" - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2011/06/07 18:35:58 UTC, 0 replies.
- [jira] [Created] (NUTCH-1006) meta equiv with single quotes not accepted - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/07 22:05:58 UTC, 0 replies.
- [jira] [Closed] (NUTCH-1002) Want to be able to filter url's through code, rather than through configuration file - crawl-urlfilter.txt - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/07 22:21:58 UTC, 0 replies.
- [jira] [Assigned] (NUTCH-1000) Add option not to commit to Solr - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/07 22:27:59 UTC, 0 replies.
- [RESULT] [VOTE] Apache Nutch 1.3 Release Candidate #3 - posted by "Mattmann, Chris A (388J)" <ch...@jpl.nasa.gov> on 2011/06/08 05:01:11 UTC, 2 replies.
- Build failed in Jenkins: Nutch-trunk #1511 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2011/06/08 06:02:38 UTC, 0 replies.
- [ANNOUNCE] Apache Nutch 1.3 released - posted by "Mattmann, Chris A (388J)" <ch...@jpl.nasa.gov> on 2011/06/08 06:03:06 UTC, 0 replies.
- Updating Wiki entries - posted by lewis john mcgibbney <le...@gmail.com> on 2011/06/08 12:26:42 UTC, 1 replies.
- [jira] [Closed] (NUTCH-837) Remove search servers and Lucene dependencies - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2011/06/08 23:31:59 UTC, 0 replies.
- [jira] [Updated] (NUTCH-386) Plugin to index categories by url rules - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2011/06/08 23:31:59 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #1512 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2011/06/09 06:03:33 UTC, 0 replies.
- 'Other Resources' section of wiki - posted by lewis john mcgibbney <le...@gmail.com> on 2011/06/09 14:07:14 UTC, 1 replies.
- Build failed in Jenkins: Nutch-trunk #1513 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2011/06/10 06:04:58 UTC, 0 replies.
- [jira] [Closed] (NUTCH-872) Change the default fetcher.parse to FALSE - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/10 12:14:58 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1006) meta equiv with single quotes not accepted - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/10 12:16:59 UTC, 3 replies.
- [jira] [Closed] (NUTCH-983) Upgrade SolrJ - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/10 12:17:00 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1000) Add option not to commit to Solr - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/10 12:17:00 UTC, 3 replies.
- [jira] [Closed] (NUTCH-967) Upgrade to Tika 0.9 - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/10 12:18:58 UTC, 0 replies.
- [jira] [Commented] (NUTCH-961) Expose Tika's boilerpipe support - posted by "Gabriele Kahlout (JIRA)" <ji...@apache.org> on 2011/06/10 12:28:58 UTC, 4 replies.
- new branch 1.4 and possible features - posted by Julien Nioche <li...@gmail.com> on 2011/06/10 12:55:03 UTC, 4 replies.
- Build failed in Jenkins: Nutch-trunk #1514 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2011/06/11 06:12:20 UTC, 0 replies.
- Parse application/xhtml+xml error - posted by blue-wolf Yang <bl...@gmail.com> on 2011/06/11 07:07:21 UTC, 1 replies.
- Nutch Compiler Problem - posted by "Ali M. Areshey" <aa...@kacst.edu.sa> on 2011/06/11 18:26:43 UTC, 0 replies.
- Re: Nutch Compiler Problem - posted by Markus Jelsma <ma...@openindex.io> on 2011/06/11 18:50:08 UTC, 0 replies.
- [Nutch Wiki] Update of "Presentations" by parker20121 - posted by Apache Wiki <wi...@apache.org> on 2011/06/11 21:48:20 UTC, 2 replies.
- Build failed in Jenkins: Nutch-trunk #1515 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2011/06/12 06:07:23 UTC, 0 replies.
- Please remove me from the mailing list - posted by Tolga Soyata <to...@gmail.com> on 2011/06/12 15:33:21 UTC, 1 replies.
- [jira] [Commented] (NUTCH-993) NullPointerException at FetcherOutputFormat.checkOutputSpecs - posted by "Paper Cruncher (JIRA)" <ji...@apache.org> on 2011/06/13 02:53:51 UTC, 2 replies.
- Bug-fix for Nutch 1.3 with solrdedup - posted by Yavinty <ya...@gmail.com> on 2011/06/13 05:20:17 UTC, 1 replies.
- Build failed in Jenkins: Nutch-trunk #1516 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2011/06/13 06:03:26 UTC, 0 replies.
- [Nutch Wiki] Trivial Update of "FrontPage" by LewisJohnMcgibbney - posted by Apache Wiki <wi...@apache.org> on 2011/06/13 12:57:27 UTC, 32 replies.
- [Nutch Wiki] Trivial Update of "Support" by LewisJohnMcgibbney - posted by Apache Wiki <wi...@apache.org> on 2011/06/13 13:56:57 UTC, 1 replies.
- [Nutch Wiki] Trivial Update of "DownloadingNutch" by LewisJohnMcgibbney - posted by Apache Wiki <wi...@apache.org> on 2011/06/13 14:00:12 UTC, 3 replies.
- [Nutch Wiki] Trivial Update of "Archive" by LewisJohnMcgibbney - posted by Apache Wiki <wi...@apache.org> on 2011/06/13 15:02:38 UTC, 6 replies.
- [Nutch Wiki] Trivial Update of "RunningNutchAndSolr" by LewisJohnMcgibbney - posted by Apache Wiki <wi...@apache.org> on 2011/06/13 15:37:45 UTC, 11 replies.
- [jira] [Created] (NUTCH-1007) Add readdb -host output - posted by "MilleBii (JIRA)" <ji...@apache.org> on 2011/06/13 19:30:51 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1007) Add readdb -host output - posted by "MilleBii (JIRA)" <ji...@apache.org> on 2011/06/13 19:40:51 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #1517 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2011/06/14 06:04:54 UTC, 0 replies.
- [Nutch Wiki] Trivial Update of "PluginCentral" by LewisJohnMcgibbney - posted by Apache Wiki <wi...@apache.org> on 2011/06/14 18:50:36 UTC, 0 replies.
- I would like to subscribe to the nutch developer mailing list - posted by nikhil murali <ni...@gmail.com> on 2011/06/14 19:08:12 UTC, 0 replies.
- [Nutch Wiki] Update of "CommandLineOptions" by LewisJohnMcgibbney - posted by Apache Wiki <wi...@apache.org> on 2011/06/14 22:56:34 UTC, 0 replies.
- [Nutch Wiki] Trivial Update of "DistributedWebDB" by LewisJohnMcgibbney - posted by Apache Wiki <wi...@apache.org> on 2011/06/16 18:00:50 UTC, 0 replies.
- [Nutch Wiki] Trivial Update of "InternalDocumentation" by LewisJohnMcgibbney - posted by Apache Wiki <wi...@apache.org> on 2011/06/16 18:16:04 UTC, 1 replies.
- [Nutch Wiki] Trivial Update of "Archive and Legacy" by LewisJohnMcgibbney - posted by Apache Wiki <wi...@apache.org> on 2011/06/16 18:18:30 UTC, 2 replies.
- Build failed in Jenkins: Nutch-trunk #1519 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2011/06/17 06:45:06 UTC, 0 replies.
- [jira] [Created] (NUTCH-1008) Switch to crawler-commons version of robots.txt parsing code - posted by "Ken Krugler (JIRA)" <ji...@apache.org> on 2011/06/17 20:12:47 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #1520 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2011/06/19 06:14:03 UTC, 0 replies.
- [Nutch Wiki] Trivial Update of "FAQ" by wentforgold - posted by Apache Wiki <wi...@apache.org> on 2011/06/19 08:38:14 UTC, 0 replies.
- [Nutch Wiki] Trivial Update of "FAQ" by MarkusJelsma - posted by Apache Wiki <wi...@apache.org> on 2011/06/19 13:29:16 UTC, 0 replies.
- [Nutch Wiki] Update of "FAQ" by Jeff Moszuti - posted by Apache Wiki <wi...@apache.org> on 2011/06/19 14:41:22 UTC, 0 replies.
- how to classify the search results by an indexed field with lucene? - posted by Joey <ma...@gmail.com> on 2011/06/20 03:56:35 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #1521 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2011/06/20 06:02:40 UTC, 0 replies.
- URL redirection and zero scores - posted by Nutch User - 1 <nu...@gmail.com> on 2011/06/20 10:16:01 UTC, 0 replies.
- [jira] [Commented] (NUTCH-999) Normalise String representation for Dates in IndexingFilters - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/20 18:46:47 UTC, 1 replies.
- [jira] [Commented] (NUTCH-802) Problems managing outlinks with large url length - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/20 19:42:47 UTC, 1 replies.
- [jira] [Commented] (NUTCH-578) URL fetched with 403 is generated over and over again - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/20 19:42:47 UTC, 0 replies.
- [jira] [Commented] (NUTCH-968) Crawling - File Error 404 when fetching file with an chinese word in the file name - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/20 19:50:47 UTC, 0 replies.
- [jira] [Created] (NUTCH-1009) Incorrect path when using readdb - posted by "Gabriele Kahlout (JIRA)" <ji...@apache.org> on 2011/06/20 19:58:47 UTC, 0 replies.
- [jira] [Closed] (NUTCH-1009) Incorrect path when using readdb - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/20 20:38:48 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1000) Add option not to commit to Solr - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/20 23:36:47 UTC, 4 replies.
- Build failed in Jenkins: Nutch-trunk #1522 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2011/06/21 11:26:57 UTC, 0 replies.
- [jira] [Issue Comment Edited] (NUTCH-1006) meta equiv with single quotes not accepted - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/21 14:55:48 UTC, 0 replies.
- TestFetcher hangs - posted by Nutch User - 1 <nu...@gmail.com> on 2011/06/21 16:55:20 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1004) Do not index empty values for title field - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/22 12:28:47 UTC, 0 replies.
- [jira] [Created] (NUTCH-1010) ContentLength not trimmed - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/22 13:06:47 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1010) ContentLength not trimmed - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/22 13:08:47 UTC, 2 replies.
- Building Nutch 2.0 from the trunk - posted by Nutch User - 1 <nu...@gmail.com> on 2011/06/22 13:50:48 UTC, 0 replies.
- [jira] [Reopened] (NUTCH-999) Normalise String representation for Dates in IndexingFilters - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/22 15:06:47 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-999) Normalise String representation for Dates in IndexingFilters - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/22 15:10:47 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1006) meta equiv with single quotes not accepted - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/22 15:12:47 UTC, 1 replies.
- [jira] [Commented] (NUTCH-1010) ContentLength not trimmed - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/22 15:14:47 UTC, 1 replies.
- Build failed in Jenkins: Nutch-trunk #1524 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2011/06/23 06:05:23 UTC, 0 replies.
- [jira] [Created] (NUTCH-1011) Remove double slashes - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/23 15:47:47 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1011) Remove double slashes - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/23 15:49:47 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1011) Normalize duplicate slashes in URL's - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/23 15:51:47 UTC, 5 replies.
- [jira] [Issue Comment Edited] (NUTCH-1011) Normalize duplicate slashes in URL's - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/23 17:55:47 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1011) Normalize duplicate slashes in URL's - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/23 18:37:47 UTC, 1 replies.
- [jira] [Created] (NUTCH-1012) Cannot handle illegal charset $charset - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/24 01:30:47 UTC, 0 replies.
- [Nutch Wiki] Trivial Update of "Presentations" by LewisJohnMcgibbney - posted by Apache Wiki <wi...@apache.org> on 2011/06/24 03:22:07 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #1525 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2011/06/24 06:03:30 UTC, 0 replies.
- [jira] [Created] (NUTCH-1013) Migrate RegexURLNormalizer from Apache ORO to java.util.regex - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/24 15:16:47 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1013) Migrate RegexURLNormalizer from Apache ORO to java.util.regex - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/24 15:18:47 UTC, 0 replies.
- [jira] [Issue Comment Edited] (NUTCH-1013) Migrate RegexURLNormalizer from Apache ORO to java.util.regex - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/24 15:20:47 UTC, 1 replies.
- [jira] [Assigned] (NUTCH-1006) meta equiv with single quotes not accepted - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/24 15:45:47 UTC, 0 replies.
- [jira] [Assigned] (NUTCH-1004) Do not index empty values for title field - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/24 15:45:47 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1012) Cannot handle illegal charset $charset - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/24 15:47:49 UTC, 3 replies.
- [jira] [Resolved] (NUTCH-1010) ContentLength not trimmed - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/24 15:57:47 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1013) Migrate RegexURLNormalizer from Apache ORO to java.util.regex - posted by "Ken Krugler (JIRA)" <ji...@apache.org> on 2011/06/24 16:35:47 UTC, 4 replies.
- [jira] [Commented] (NUTCH-1012) Cannot handle illegal charset $charset - posted by "Ken Krugler (JIRA)" <ji...@apache.org> on 2011/06/24 16:37:47 UTC, 3 replies.
- [jira] [Resolved] (NUTCH-1006) meta equiv with single quotes not accepted - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/24 16:39:47 UTC, 0 replies.
- [jira] [Created] (NUTCH-1014) Migrate from Apache ORO to java.util.regex - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/24 17:45:47 UTC, 0 replies.
- [jira] [Issue Comment Edited] (NUTCH-1012) Cannot handle illegal charset $charset - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/24 18:13:47 UTC, 0 replies.
- [jira] [Updated] (NUTCH-987) Support HTTP auth for Solr communication - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/24 22:45:47 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #1526 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2011/06/25 06:41:16 UTC, 0 replies.
- [jira] [Updated] (NUTCH-965) Skip parsing for truncated documents - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/25 14:48:47 UTC, 0 replies.
- [jira] [Updated] (NUTCH-717) Make Nutch Solr integration easier - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/25 14:50:48 UTC, 0 replies.
- [jira] [Closed] (NUTCH-939) Added -dir command line option to Indexer and SolrIndexer, allowing to specify directory containing segments - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/25 14:54:48 UTC, 0 replies.
- [jira] [Closed] (NUTCH-824) Crawling - File Error 404 when fetching file with an hexadecimal character in the file name. - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/25 14:54:48 UTC, 0 replies.
- [jira] [Closed] (NUTCH-984) Parse-tika throws some URL's away - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/25 14:54:48 UTC, 0 replies.
- [jira] [Closed] (NUTCH-995) Generate POM file using the Ivy makepom task - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/25 14:54:48 UTC, 0 replies.
- [jira] [Closed] (NUTCH-957) fetcher.timelimit.mins is invalid when depth is greater than 1 - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/25 14:54:49 UTC, 0 replies.
- [jira] [Closed] (NUTCH-975) Fix missing/wrong headers in source files - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/25 14:54:49 UTC, 0 replies.
- [jira] [Closed] (NUTCH-962) max. redirects not handled correctly: fetcher stops at max-1 redirects - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/25 14:54:49 UTC, 0 replies.
- [jira] [Closed] (NUTCH-1003) 'package' task does not reflect the new organisation of the code - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/25 14:54:49 UTC, 0 replies.
- [jira] [Closed] (NUTCH-997) IndexingFitlers to store Date objects instead of Strings - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/25 14:54:49 UTC, 0 replies.
- [jira] [Closed] (NUTCH-994) Fine tune Solr schema - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/25 14:54:49 UTC, 0 replies.
- [jira] [Closed] (NUTCH-972) Mergedb doesn't merge with empty directory, as is the case with merge (for indexes) - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/25 14:54:49 UTC, 0 replies.
- [jira] [Closed] (NUTCH-948) Remove Lucene dependencies - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/25 14:54:49 UTC, 0 replies.
- [jira] [Closed] (NUTCH-954) Bugfix for Content-Length limit in http protocols - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/25 14:54:49 UTC, 0 replies.
- [jira] [Updated] (NUTCH-956) solrindex issues - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/25 14:59:47 UTC, 0 replies.
- [jira] [Updated] (NUTCH-295) More description for fetcher.threads.fetch property - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/25 15:01:47 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #1527 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2011/06/26 06:30:00 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #1528 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2011/06/27 06:02:30 UTC, 0 replies.
- [jira] [Created] (NUTCH-1015) can't parse erroneous date: 2006-05-24T20:03:42 - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/27 12:30:47 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1015) MoreIndexingFilter: can't parse erroneous date: 2006-05-24T20:03:42 - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/27 12:34:47 UTC, 2 replies.
- [jira] [Assigned] (NUTCH-295) More description for fetcher.threads.fetch property - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/27 13:38:47 UTC, 0 replies.
- [jira] [Closed] (NUTCH-295) More description for fetcher.threads.fetch property - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/27 13:42:48 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-295) More description for fetcher.threads.fetch property - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/27 13:42:48 UTC, 0 replies.
- [jira] [Commented] (NUTCH-956) solrindex issues - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/27 14:28:47 UTC, 0 replies.
- [jira] [Issue Comment Edited] (NUTCH-961) Expose Tika's boilerpipe support - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/27 15:17:47 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1016) Strip UTF-8 non-character codepoints - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/27 17:47:47 UTC, 9 replies.
- [jira] [Created] (NUTCH-1016) Strip UTF-8 non-character codepoints - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/27 17:47:47 UTC, 0 replies.
- [jira] [Created] (NUTCH-1017) Exception getting mime type by name - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/27 21:29:47 UTC, 0 replies.
- [jira] [Created] (NUTCH-1018) Solr Document Size Limit - posted by "Mark Achee (JIRA)" <ji...@apache.org> on 2011/06/27 22:01:48 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1018) Solr Document Size Limit - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/27 22:23:47 UTC, 2 replies.
- [Nutch Wiki] Update of "bin/nutch_crawl" by LewisJohnMcgibbney - posted by Apache Wiki <wi...@apache.org> on 2011/06/28 04:12:59 UTC, 0 replies.
- [Nutch Wiki] Trivial Update of "bin/nutch_crawl" by LewisJohnMcgibbney - posted by Apache Wiki <wi...@apache.org> on 2011/06/28 04:14:56 UTC, 5 replies.
- [jira] [Updated] (NUTCH-1019) Edit comment in org.apache.nutch.crawl.Crawl to reflect removal of legacy - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2011/06/28 04:22:17 UTC, 0 replies.
- [jira] [Created] (NUTCH-1019) Edit comment in org.apache.nutc.crawl.Crawl to reflect removal of legacy - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2011/06/28 04:22:17 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #1529 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2011/06/28 06:03:36 UTC, 0 replies.
- [Nutch Wiki] Trivial Update of "CommandLineOptions" by LewisJohnMcgibbney - posted by Apache Wiki <wi...@apache.org> on 2011/06/28 08:18:41 UTC, 7 replies.
- [Nutch Wiki] Trivial Update of "bin/nutch_readdb" by LewisJohnMcgibbney - posted by Apache Wiki <wi...@apache.org> on 2011/06/28 08:30:27 UTC, 1 replies.
- [jira] [Commented] (NUTCH-1020) Create or locate class for org.apache.nutch.tools.compat.CrawlDbConverter - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2011/06/28 08:46:17 UTC, 4 replies.
- [jira] [Created] (NUTCH-1020) Create or locate class for org.apache.nutch.tools.compat.CrawlDbConverter - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2011/06/28 08:46:17 UTC, 0 replies.
- [Nutch Wiki] Trivial Update of "bin/nutch_convdb" by LewisJohnMcgibbney - posted by Apache Wiki <wi...@apache.org> on 2011/06/28 08:47:33 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1019) Edit comment in org.apache.nutch.crawl.Crawl to reflect removal of legacy - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/28 13:38:17 UTC, 2 replies.
- [jira] [Created] (NUTCH-1021) Migrate OutlinkExtractor from Apache ORO to java.util.regex - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/28 14:38:17 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1021) Migrate OutlinkExtractor from Apache ORO to java.util.regex - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/28 15:10:17 UTC, 1 replies.
- [jira] [Updated] (NUTCH-1022) Upgrade version number of Nutch agent in conf - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/28 15:53:17 UTC, 0 replies.
- [jira] [Created] (NUTCH-1022) Upgrade version number of Nutch agent in conf - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/28 15:53:17 UTC, 0 replies.
- [jira] [Closed] (NUTCH-1022) Upgrade version number of Nutch agent in conf - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/28 15:55:17 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1022) Upgrade version number of Nutch agent in conf - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/28 15:55:17 UTC, 0 replies.
- [jira] [Issue Comment Edited] (NUTCH-1021) Migrate OutlinkExtractor from Apache ORO to java.util.regex - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/28 16:15:17 UTC, 0 replies.
- [jira] [Updated] (NUTCH-1021) Migrate OutlinkExtractor from Apache ORO to java.util.regex - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/28 16:15:17 UTC, 2 replies.
- [jira] [Commented] (NUTCH-1017) Exception getting mime type by name - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/28 17:11:16 UTC, 0 replies.
- [jira] [Issue Comment Edited] (NUTCH-1017) Exception getting mime type by name - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/28 17:13:16 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1012) Cannot handle illegal charset $charset - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/28 18:04:17 UTC, 0 replies.
- Relevance of -dir parameter in org.apache.nutch.crawl.Crawl - posted by lewis john mcgibbney <le...@gmail.com> on 2011/06/28 22:00:49 UTC, 0 replies.
- [Nutch Wiki] Update of "bin/nutch mergedb" by LewisJohnMcgibbney - posted by Apache Wiki <wi...@apache.org> on 2011/06/28 22:13:57 UTC, 0 replies.
- [Nutch Wiki] Trivial Update of "bin/nutch mergedb" by LewisJohnMcgibbney - posted by Apache Wiki <wi...@apache.org> on 2011/06/28 22:17:31 UTC, 2 replies.
- [jira] [Commented] (NUTCH-1023) Trivial error in error message for org.apache.nutch.crawl.LinkDbReader - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2011/06/28 22:48:17 UTC, 0 replies.
- [jira] [Created] (NUTCH-1023) Trivial error in error message for org.apache.nutch.crawl.LinkDbReader - posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2011/06/28 22:48:17 UTC, 0 replies.
- [Nutch Wiki] Update of "bin/nutch readlinkdb" by LewisJohnMcgibbney - posted by Apache Wiki <wi...@apache.org> on 2011/06/28 22:59:17 UTC, 0 replies.
- [Nutch Wiki] Trivial Update of "bin/nutch readlinkdb" by LewisJohnMcgibbney - posted by Apache Wiki <wi...@apache.org> on 2011/06/28 23:00:20 UTC, 0 replies.
- [Nutch Wiki] Update of "bin/nutch_readdb" by LewisJohnMcgibbney - posted by Apache Wiki <wi...@apache.org> on 2011/06/28 23:14:32 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #1530 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2011/06/29 06:01:02 UTC, 0 replies.
- [jira] [Commented] (NUTCH-994) Fine tune Solr schema - posted by "Hudson (JIRA)" <ji...@apache.org> on 2011/06/29 06:01:28 UTC, 0 replies.
- [jira] [Commented] (NUTCH-989) index-basic plugin doesn't use Solr date fieldType - posted by "Hudson (JIRA)" <ji...@apache.org> on 2011/06/29 06:01:28 UTC, 0 replies.
- [jira] [Commented] (NUTCH-888) Remove parse-rss - posted by "Hudson (JIRA)" <ji...@apache.org> on 2011/06/29 06:01:29 UTC, 0 replies.
- [jira] [Commented] (NUTCH-983) Upgrade SolrJ - posted by "Hudson (JIRA)" <ji...@apache.org> on 2011/06/29 06:01:29 UTC, 0 replies.
- [jira] [Commented] (NUTCH-967) Upgrade to Tika 0.9 - posted by "Hudson (JIRA)" <ji...@apache.org> on 2011/06/29 06:01:29 UTC, 0 replies.
- [jira] [Commented] (NUTCH-991) SolrDedup must issue a commit - posted by "Hudson (JIRA)" <ji...@apache.org> on 2011/06/29 06:01:29 UTC, 0 replies.
- [jira] [Commented] (NUTCH-986) Dedup fails due to date format (long) - posted by "Hudson (JIRA)" <ji...@apache.org> on 2011/06/29 06:01:29 UTC, 0 replies.
- [ANNOUNCEMENT] Lewis John Mc Gibbney is a Nutch committer and PMC member - posted by Julien Nioche <li...@gmail.com> on 2011/06/29 10:06:52 UTC, 3 replies.
- [Nutch Wiki] Update of "Presentations" by JulienNioche - posted by Apache Wiki <wi...@apache.org> on 2011/06/29 12:25:35 UTC, 0 replies.
- [jira] [Reopened] (NUTCH-872) Change the default fetcher.parse to FALSE - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/29 21:16:28 UTC, 0 replies.
- [jira] [Updated] (NUTCH-872) Change the default fetcher.parse to FALSE - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/29 21:16:28 UTC, 0 replies.
- [jira] [Updated] (NUTCH-578) URL fetched with 403 is generated over and over again - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/29 21:18:28 UTC, 0 replies.
- [jira] [Commented] (NUTCH-1016) Strip UTF-8 non-character codepoints - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/29 21:36:28 UTC, 0 replies.
- [jira] [Created] (NUTCH-1024) Dynamically set fetchInterval by MIME-type - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/30 01:23:28 UTC, 0 replies.
- Build failed in Jenkins: Nutch-trunk #1531 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2011/06/30 06:01:28 UTC, 0 replies.
- [Nutch Wiki] Update of "bin/nutch_inject" by LewisJohnMcgibbney - posted by Apache Wiki <wi...@apache.org> on 2011/06/30 07:03:41 UTC, 0 replies.
- [Nutch Wiki] Trivial Update of "bin/nutch_inject" by LewisJohnMcgibbney - posted by Apache Wiki <wi...@apache.org> on 2011/06/30 07:04:42 UTC, 0 replies.
- [jira] [Updated] (NUTCH-993) NullPointerException at FetcherOutputFormat.checkOutputSpecs - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/30 11:30:28 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1016) Strip UTF-8 non-character codepoints - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/30 14:14:28 UTC, 1 replies.
- [jira] [Reopened] (NUTCH-1016) Strip UTF-8 non-character codepoints - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/30 14:24:28 UTC, 0 replies.
- Create separate issues for 2.0? - posted by Markus Jelsma <ma...@openindex.io> on 2011/06/30 15:25:24 UTC, 5 replies.
- [jira] [Created] (NUTCH-1025) Add option not to commit to Solr - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/30 18:17:28 UTC, 0 replies.
- [jira] [Created] (NUTCH-1026) Strip UTF-8 non-character codepoints - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/30 18:19:28 UTC, 0 replies.
- [jira] [Resolved] (NUTCH-1000) Add option not to commit to Solr - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/06/30 18:19:28 UTC, 0 replies.