Posted to dev@nutch.apache.org by Piotr Kosiorowski <pk...@gmail.com> on 2006/01/01 20:46:25 UTC

Re: Mega-cleanup in trunk/

Andrzej Bialecki wrote:
> Hi,
> 
> I just committed a large patch to clean up the trunk/ of obsolete and 
> broken classes remaining from the 0.7.x development line. Please test 
> that things still work as they should ...
> 
Hi,
I am not sure what is wrong, but a lot of the JUnit tests simply do not 
compile. I did an svn checkout into a new directory to be sure I do not 
have anything left over from my experiments.

I am looking at it right now, but I would suggest doing a quick 
temporary cleanup to make trunk testable:

1) Remove permanently, as the classes under test have been removed from trunk:
	src/test/org/apache/nutch/pagedb/TestFetchListEntry.java
	src/test/org/apache/nutch/pagedb/TestPage.java
	src/test/org/apache/nutch/db/TestWebDB.java
	src/test/org/apache/nutch/db/DBTester.java
	src/test/org/apache/nutch/tools/TestSegmentMergeTool.java
2) Remove temporarily and create a JIRA issue to fix them:
	src/test/org/apache/nutch/fetcher/TestFetcher.java
	src/test/org/apache/nutch/fetcher/TestFetcherOutput.java

3) Remove unused import in:
		src/test/org/apache/nutch/parse/TestParseText.java
4) Fix (these look simple to fix; I will look at them in the meantime):

src/plugin/parse-msword/src/test/org/apache/nutch/parse/msword/TestMSWordParser.java
src/plugin/parse-zip/src/test/org/apache/nutch/parse/zip/TestZipParser.java
src/plugin/parse-rss/src/test/org/apache/nutch/parse/rss/TestRSSParser.java
src/plugin/parse-pdf/src/test/org/apache/nutch/parse/pdf/TestPdfParser.java
src/plugin/parse-ext/src/test/org/apache/nutch/parse/ext/TestExtParser.java
src/plugin/parse-mspowerpoint/src/test/org/apache/nutch/parse/mspowerpoint/TestMSPowerPointParser.java
src/plugin/parse-mspowerpoint/src/test/org/apache/nutch/parse/mspowerpoint/AllTests.java

After removing all these non-compiling classes, the trunk tests complete 
successfully on my machine (JDK 1.4.2).

If there are no objections, especially from Andrzej, I can do the 
cleanup tomorrow.
P.



Re: Mega-cleanup in trunk/

Posted by Andrzej Bialecki <ab...@getopt.org>.
Piotr Kosiorowski wrote:

> Andrzej Bialecki wrote:
>
>> Hi,
>>
>> I just committed a large patch to clean up the trunk/ of obsolete and 
>> broken classes remaining from the 0.7.x development line. Please test 
>> that things still work as they should ...
>>
> Hi,
> I am not sure what is wrong, but a lot of the JUnit tests simply do not 
> compile. I did an svn checkout into a new directory to be sure I do not 
> have anything left over from my experiments.


Yes, you are right - I would welcome any help, I'm a bit tight on time...

>
> I am looking at it right now, but I would suggest doing a quick 
> temporary cleanup to make trunk testable:
>

Agreed.


> 3) Remove unused import in:
>         src/test/org/apache/nutch/parse/TestParseText.java


Ok.

> 4) Fix (these look simple to fix; I will look at them in the meantime):
>
> src/plugin/parse-msword/src/test/org/apache/nutch/parse/msword/TestMSWordParser.java
> src/plugin/parse-zip/src/test/org/apache/nutch/parse/zip/TestZipParser.java
> src/plugin/parse-rss/src/test/org/apache/nutch/parse/rss/TestRSSParser.java
> src/plugin/parse-pdf/src/test/org/apache/nutch/parse/pdf/TestPdfParser.java
> src/plugin/parse-ext/src/test/org/apache/nutch/parse/ext/TestExtParser.java
> src/plugin/parse-mspowerpoint/src/test/org/apache/nutch/parse/mspowerpoint/TestMSPowerPointParser.java
> src/plugin/parse-mspowerpoint/src/test/org/apache/nutch/parse/mspowerpoint/AllTests.java
>


Yes, they are just one-line fixes. I removed the 
getProtocolContent(urlString) methods; you need to replace calls to them 
with getProtocolContent(new UTF8(urlString), new CrawlDatum()).
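
As a rough illustration of that one-line change, here is a minimal sketch. 
The UTF8, CrawlDatum, Content and Protocol types below are simplified 
stand-ins so the example compiles on its own; they are not the real Nutch 
classes, and the return type and the fetchUrl helper are assumptions made 
only for this example.

```java
// Sketch only: every nested type here is a hypothetical stand-in,
// NOT the real Nutch class of the same name.
public class ProtocolCallSketch {

    // Stand-in for the UTF8 wrapper type named in the thread.
    static class UTF8 {
        private final String value;
        UTF8(String value) { this.value = value; }
        @Override public String toString() { return value; }
    }

    // Stand-in for CrawlDatum; an empty instance is enough for a test call.
    static class CrawlDatum { }

    // Stand-in for whatever the protocol call returns (assumed here).
    static class Content {
        final String url;
        Content(String url) { this.url = url; }
    }

    // Only the two-argument form exists after the cleanup.
    interface Protocol {
        Content getProtocolContent(UTF8 url, CrawlDatum datum);
    }

    // Trivial fake implementation that echoes the URL back.
    static class FakeProtocol implements Protocol {
        public Content getProtocolContent(UTF8 url, CrawlDatum datum) {
            return new Content(url.toString());
        }
    }

    // Hypothetical helper showing the call site as a test would write it.
    static String fetchUrl(Protocol protocol, String urlString) {
        // Before the cleanup: protocol.getProtocolContent(urlString)
        Content content =
            protocol.getProtocolContent(new UTF8(urlString), new CrawlDatum());
        return content.url;
    }

    public static void main(String[] args) {
        System.out.println(fetchUrl(new FakeProtocol(), "file:///tmp/test.pdf"));
    }
}
```

In each of the parser tests listed above, only the call site should need 
to change in this way.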

>
> After removing all these non-compiling classes, the trunk tests 
> complete successfully on my machine (JDK 1.4.2).
>
> If there are no objections, especially from Andrzej, I can do 
> the cleanup tomorrow.


Your help would be most welcome, no objections here.

-- 
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com