Posted to dev@nutch.apache.org by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2014/06/12 16:07:02 UTC

[jira] [Commented] (NUTCH-1792) Refactor resource loading in plugin tests

    [ https://issues.apache.org/jira/browse/NUTCH-1792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14029163#comment-14029163 ] 

Lewis John McGibbney commented on NUTCH-1792:
---------------------------------------------

BTW this issue relates directly to the fact that the Nutch 2.X branch build has been failing on Jenkins since build #1015:

Failed   #1015   May 18, 2014 3:33:19 AM
Success  #1014   May 9, 2014 5:35:29 AM

There were no builds on Jenkins between May 9th and 18th, which is consistent with the problems on builds.apache.org. Here is the output of svn log --limit 15:

------------------------------------------------------------------------
r1601937 | jnioche | 2014-06-11 11:56:20 -0400 (Wed, 11 Jun 2014) | 1 line

NUTCH-1736 Can't fetch page if http response header contains Transfer-Encoding:chunked
------------------------------------------------------------------------
r1600837 | markus | 2014-06-06 06:01:51 -0400 (Fri, 06 Jun 2014) | 2 lines

NUTCH-1782 NodeWalker to return current node

------------------------------------------------------------------------
r1600599 | jnioche | 2014-06-05 07:09:42 -0400 (Thu, 05 Jun 2014) | 1 line

Fixing blunder in Nutch-1781
------------------------------------------------------------------------
r1600561 | lewismc | 2014-06-04 23:00:10 -0400 (Wed, 04 Jun 2014) | 1 line

NUTCH-1788 Tika may return multiple values for Title on PDF's
------------------------------------------------------------------------
r1600559 | lewismc | 2014-06-04 22:17:14 -0400 (Wed, 04 Jun 2014) | 1 line

Temporary disable TestGoraStore due to GORA-326 Removal of _g_dirty field from _ALL_FIELDS array and Field Enum in Persistent classes
------------------------------------------------------------------------
r1600546 | lewismc | 2014-06-04 20:18:02 -0400 (Wed, 04 Jun 2014) | 1 line

NUTCH-1781 Update gora-*-mapping.xml and gora.properties to reflect Gora 0.4
------------------------------------------------------------------------
r1598622 | jnioche | 2014-05-30 10:55:51 -0400 (Fri, 30 May 2014) | 1 line

NUTCH-1768 Upgrade to ElasticSearch 1.1.0
------------------------------------------------------------------------
r1598619 | jnioche | 2014-05-30 10:50:45 -0400 (Fri, 30 May 2014) | 1 line

NUTCH-1634 : readdb -stats shows the result twice
------------------------------------------------------------------------
r1595398 | lewismc | 2014-05-16 20:38:18 -0400 (Fri, 16 May 2014) | 1 line

NUTCH-1780 ttl and gc_grace_seconds attributes are missing from gora-cassandra-mapping.xml file
------------------------------------------------------------------------
r1595196 | jnioche | 2014-05-16 09:40:21 -0400 (Fri, 16 May 2014) | 1 line

NUTCH-1676 Add rudimentary SSL support to protocol-http
------------------------------------------------------------------------
r1594813 | jnioche | 2014-05-15 04:14:38 -0400 (Thu, 15 May 2014) | 1 line

NUTCH-1674 Use batchId filter to enable scan (GORA-119) for Fetch,Parse,Update,Index (Tien Nguyen Manh and Alparslan Avcı via jnioche)
------------------------------------------------------------------------
r1594812 | jnioche | 2014-05-15 04:10:07 -0400 (Thu, 15 May 2014) | 1 line

NUTCH-1714 Nutch 2.x upgrade to Gora 0.4
------------------------------------------------------------------------
r1594071 | snagel | 2014-05-12 15:39:43 -0400 (Mon, 12 May 2014) | 1 line

NUTCH-1752 Cache robots.txt rules per protocol:host:port
------------------------------------------------------------------------
r1593954 | jnioche | 2014-05-12 08:58:41 -0400 (Mon, 12 May 2014) | 1 line

NUTCH-1613 Timeouts in protocol-httpclient when crawling same host with >2 threads
------------------------------------------------------------------------
r1592414 | snagel | 2014-05-04 16:18:50 -0400 (Sun, 04 May 2014) | 1 line

NUTCH-1182 fetcher to log hung threads
------------------------------------------------------------------------



> Refactor resource loading in plugin tests
> -----------------------------------------
>
>                 Key: NUTCH-1792
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1792
>             Project: Nutch
>          Issue Type: Improvement
>          Components: test
>    Affects Versions: 1.8, 2.2.1
>            Reporter: Lewis John McGibbney
>            Assignee: Lewis John McGibbney
>             Fix For: 2.3, 1.9
>
>
> Right now we have a strange method for loading test resources e.g.
> urlString = "file:" + sampleDir + fileSeparator + sampleFiles[i];
> File file = new File(sampleDir + fileSeparator + sampleFiles[i]);
> This works fine from the command line but fails to locate and load the resource within the Eclipse IDE, which is not ideal.
> I am investigating whether we can do getClass().getResource...
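
For illustration only, here is a minimal sketch of what classpath-based loading could look like. It is not the patch for this issue. The class name ResourceLoadingSketch, the helper names resolve and demo, and the file name test.txt are all hypothetical, and the temporary directory plus URLClassLoader merely stand in for a plugin's sample directory being on the test classpath. In a real plugin test the loader would more likely be getClass().getClassLoader(), assuming the sample files are on the classpath.

```java
import java.io.File;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class ResourceLoadingSketch {

  // Hypothetical helper: look a resource up on the classpath rather than
  // building a path relative to the current working directory.
  static File resolve(ClassLoader loader, String name) throws Exception {
    URL url = loader.getResource(name);
    if (url == null) {
      throw new IllegalArgumentException("Resource not on classpath: " + name);
    }
    // Only valid for file: URLs; a resource inside a jar would need
    // getResourceAsStream instead.
    return Paths.get(url.toURI()).toFile();
  }

  // Stand-in for a test: a temp directory plays the role of the sample
  // directory, and a URLClassLoader plays the role of the test classpath.
  static String demo() throws Exception {
    Path dir = Files.createTempDirectory("sample-dir");
    Path sample = dir.resolve("test.txt");
    Files.write(sample, "hello".getBytes(StandardCharsets.UTF_8));
    try (URLClassLoader loader =
        new URLClassLoader(new URL[] { dir.toUri().toURL() }, null)) {
      File file = resolve(loader, "test.txt");
      return new String(Files.readAllBytes(file.toPath()), StandardCharsets.UTF_8);
    } finally {
      // Clean up the temporary sample file and directory.
      Files.deleteIfExists(sample);
      Files.deleteIfExists(dir);
    }
  }

  public static void main(String[] args) throws Exception {
    System.out.println(demo());
  }
}
```

The point of the sketch is that the lookup depends only on what is on the classpath, so it should behave the same from the command line and from an IDE, provided both put the sample files on the test classpath.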


