Posted to dev@nutch.apache.org by Piotr Kosiorowski <pk...@gmail.com> on 2006/01/01 20:46:25 UTC
Re: Mega-cleanup in trunk/
Andrzej Bialecki wrote:
> Hi,
>
> I just committed a large patch to clean up trunk/, removing obsolete and
> broken classes remaining from the 0.7.x development line. Please test
> that things still work as they should ...
>
Hi,
I am not sure what is wrong, but a lot of the JUnit tests simply do not
compile - I did an svn checkout to a new directory to be sure I do not
have anything left over from my experiments.
I am looking at it right now, but I would suggest temporarily doing a
quick cleanup to make trunk testable:
1) Remove permanently, as the classes under test have been removed from trunk:
src/test/org/apache/nutch/pagedb/TestFetchListEntry.java
src/test/org/apache/nutch/pagedb/TestPage.java
src/test/org/apache/nutch/db/TestWebDB.java
src/test/org/apache/nutch/db/DBTester.java
src/test/org/apache/nutch/tools/TestSegmentMergeTool.java
2) Remove temporarily and create a JIRA issue to fix them:
src/test/org/apache/nutch/fetcher/TestFetcher.java
src/test/org/apache/nutch/fetcher/TestFetcherOutput.java
3) Remove unused import in:
src/test/org/apache/nutch/parse/TestParseText.java
4) Fix (these look simple to fix - I will look at them in the meantime):
src/plugin/parse-msword/src/test/org/apache/nutch/parse/msword/TestMSWordParser.java
src/plugin/parse-zip/src/test/org/apache/nutch/parse/zip/TestZipParser.java
src/plugin/parse-rss/src/test/org/apache/nutch/parse/rss/TestRSSParser.java
src/plugin/parse-pdf/src/test/org/apache/nutch/parse/pdf/TestPdfParser.java
src/plugin/parse-ext/src/test/org/apache/nutch/parse/ext/TestExtParser.java
src/plugin/parse-mspowerpoint/src/test/org/apache/nutch/parse/mspowerpoint/TestMSPowerPointParser.java
src/plugin/parse-mspowerpoint/src/test/org/apache/nutch/parse/mspowerpoint/AllTests.java
After removing all of these non-compiling classes, the trunk tests complete
successfully on my machine (JDK 1.4.2).
If there are no objections, especially from Andrzej, I can do the
cleanup tomorrow.
P.
Re: Mega-cleanup in trunk/
Posted by Andrzej Bialecki <ab...@getopt.org>.
Piotr Kosiorowski wrote:
> Andrzej Bialecki wrote:
>
>> Hi,
>>
>> I just committed a large patch to clean up trunk/, removing obsolete and
>> broken classes remaining from the 0.7.x development line. Please test
>> that things still work as they should ...
>>
> Hi,
> I am not sure what is wrong, but a lot of the JUnit tests simply do not
> compile - I did an svn checkout to a new directory to be sure I do not
> have anything left over from my experiments.
Yes, you are right - I would welcome any help, I'm a bit tight on time...
>
> I am looking at it right now, but I would suggest temporarily doing a
> quick cleanup to make trunk testable:
>
Agreed.
> 3) Remove unused import in:
> src/test/org/apache/nutch/parse/TestParseText.java
Ok.
> 4) Fix (these look simple to fix - I will look at them in the meantime):
>
> src/plugin/parse-msword/src/test/org/apache/nutch/parse/msword/TestMSWordParser.java
>
> src/plugin/parse-zip/src/test/org/apache/nutch/parse/zip/TestZipParser.java
>
> src/plugin/parse-rss/src/test/org/apache/nutch/parse/rss/TestRSSParser.java
>
> src/plugin/parse-pdf/src/test/org/apache/nutch/parse/pdf/TestPdfParser.java
>
> src/plugin/parse-ext/src/test/org/apache/nutch/parse/ext/TestExtParser.java
>
> src/plugin/parse-mspowerpoint/src/test/org/apache/nutch/parse/mspowerpoint/TestMSPowerPointParser.java
>
> src/plugin/parse-mspowerpoint/src/test/org/apache/nutch/parse/mspowerpoint/AllTests.java
>
Yes, they are just one-line fixes. I removed the
getProtocolContent(urlString) methods; you need to replace those calls with
getProtocolContent(new UTF8(urlString), new CrawlDatum()).
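As a rough illustration of that one-line change, here is a self-contained sketch. It does not use the real Nutch classes: UTF8 and CrawlDatum below are minimal stand-ins so the example compiles on its own, and FakeProtocol, ProtocolCallSketch, fetch() and the String return type are hypothetical names and simplifications chosen only for this example.

```java
// Stand-in for the real UTF8 class; exists only so this sketch compiles alone.
final class UTF8 {
    private final String value;
    UTF8(String value) { this.value = value; }
    public String toString() { return value; }
}

// Stand-in for the real CrawlDatum class; an empty placeholder here.
final class CrawlDatum { }

// Hypothetical object exposing getProtocolContent. The String return type
// is an assumption made for the sketch, not the real signature.
class FakeProtocol {
    // Old form, removed in the cleanup:
    //   getProtocolContent(String urlString)

    // New form: the URL is wrapped in UTF8 and a CrawlDatum is passed along.
    String getProtocolContent(UTF8 url, CrawlDatum datum) {
        return "content for " + url;
    }
}

public class ProtocolCallSketch {
    // Shows the call-site change a test would need.
    static String fetch(FakeProtocol protocol, String urlString) {
        // Before: protocol.getProtocolContent(urlString)
        return protocol.getProtocolContent(new UTF8(urlString), new CrawlDatum());
    }

    public static void main(String[] args) {
        System.out.println(fetch(new FakeProtocol(), "file:/tmp/test.pdf"));
    }
}
```

In the affected parser tests the only edit should be at the call site itself, as in fetch() above; the rest of each test can presumably stay unchanged.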
>
> After removing all of these non-compiling classes, the trunk tests
> complete successfully on my machine (JDK 1.4.2).
>
> If there are no objections, especially from Andrzej, I can do
> the cleanup tomorrow.
Your help would be most welcome, no objections here.
--
Best regards,
Andrzej Bialecki <><
___. ___ ___ ___ _ _ __________________________________
[__ || __|__/|__||\/| Information Retrieval, Semantic Web
___|||__|| \| || | Embedded Unix, System Integration
http://www.sigram.com Contact: info at sigram dot com