Posted to dev@nutch.apache.org by "Rodrigo Reyes C." <rr...@corbitecso.com> on 2009/03/21 02:02:03 UTC

Problems compiling Nutch in Eclipse

Hi

I have configured my Eclipse project as described here:

http://wiki.apache.org/nutch/RunNutchInEclipse0.9

Still, I am getting the following errors:

   - The return type is incompatible with Parser.getParse(Content)
   RTFParseFactory.java
   nutch/src/plugin/parse-rtf/src/java/org/apache/nutch/parse/rtf    line 52
   Java Problem
   - Type mismatch: cannot convert from ParseResult to Parse
   TestRTFParser.java
   nutch/src/plugin/parse-rtf/src/test/org/apache/nutch/parse/rtf    line 78
   Java Problem

Any ideas on what could be wrong? I have already included the jars from both
http://nutch.cvs.sourceforge.net/nutch/nutch/src/plugin/parse-mp3/lib/ and
http://nutch.cvs.sourceforge.net/nutch/nutch/src/plugin/parse-rtf/lib/.

Thanks in advance

-- 
Rodrigo Reyes C.

Re: Problems compiling Nutch in Eclipse

Posted by "Rodrigo Reyes C." <rr...@corbitecso.com>.
Ninad

Thanks for your answer. I have to say I am eager to read everything you have
written in your blog about Nutch's inner workings. I've already done
everything your blog post says to do (and a couple more things, like
downloading a couple of extra jars that are not included in the SVN
version).

Nevertheless, I am still getting the errors I described. I should also
mention that I am working on the trunk code base, not the 0.9 code base. Maybe
that is why I am getting these errors.

Rodrigo
PS: By the way, I did manage to get Nutch crawling late last night. Still,
I haven't been able to compile this specific plugin (the RTF plugin).


Re: Problems compiling Nutch in Eclipse

Posted by Ninad Raut <ni...@gmail.com>.
inverted index - A sequence of (key, pointer) pairs where each pointer
points to a record in a database which contains the key value in some
particular field. The index is sorted on the key values to allow rapid
searching for a particular key value, using e.g. binary search. The
index is "inverted" in the sense that the key value is used to find
the record rather than the other way round.
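
As a rough illustration of the definition above, here is a minimal Java sketch of an inverted index. It is not how Nutch builds its index; the class name InvertedIndexSketch, its methods, and the sample documents are all made up for this example. A sorted map stands in for the sorted (key, pointer) pairs, with each key (a term) pointing to the ids of the documents that contain it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical sketch: not a Nutch class.
class InvertedIndexSketch {
    // term -> ids of the documents containing that term.
    // TreeMap keeps the keys sorted, so a lookup is a logarithmic-time search.
    private final TreeMap<String, TreeSet<Integer>> postings = new TreeMap<>();

    // Split the text into lowercase terms and record the document id under each term.
    void add(int docId, String text) {
        for (String term : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
            }
        }
    }

    // Go from a term to the documents containing it (the "inverted" direction).
    List<Integer> lookup(String term) {
        TreeSet<Integer> ids = postings.get(term.toLowerCase(Locale.ROOT));
        return ids == null ? new ArrayList<>() : new ArrayList<>(ids);
    }

    public static void main(String[] args) {
        InvertedIndexSketch index = new InvertedIndexSketch();
        index.add(1, "Nutch crawls the web");
        index.add(2, "Nutch builds an index");
        index.add(3, "An index speeds up search");
        System.out.println(index.lookup("nutch")); // prints [1, 2]
        System.out.println(index.lookup("index")); // prints [2, 3]
    }
}
```

The forward direction would be document id to text; the index is the reverse mapping, which is why a term lookup does not need to scan every document.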

In Nutch, indexes are created from:

<url, ParseData> from parse, for title, metadata, etc.
<url, ParseText> from parse, for text
<url, Inlinks> from invert, for anchors
<url, CrawlDatum> from fetch, for fetch date

Check out the indexes folder after crawling.



Re: Problems compiling Nutch in Eclipse

Posted by "Rodrigo Reyes C." <rr...@corbitecso.com>.
Ninad

I've been reading your blog, specifically the article named "Nutch
Architecture". I posted a comment there but I am not sure you have noticed
it so I will post it here too.

What do you mean by:

"The index is the inverted index of all of the pages the system has
retrieved, and is created by merging all of the individual segment indexes."

Can you give us an example of what the original segment index looks like and
how it is inverted? Thanks

Rodrigo


Re: Problems compiling Nutch in Eclipse

Posted by Ninad Raut <ni...@gmail.com>.
Check out my blog:
http://j2eewebsearch.blogspot.com/

Check out the third point...

Let me know if you get it all right. Your comments will be appreciated.

Regards,
Ninad


Re: Problems compiling Nutch in Eclipse

Posted by "Rodrigo Reyes C." <rr...@corbitecso.com>.
Doğacan

This answers my questions. Thank you so much.

Rodrigo




-- 
Rodrigo Reyes C.
Software Developer

Avity LLC
105 Court Street, Suite 401
New Haven, CT 06511-6957

O rrc179
F 203-643-2002
rodrigo.reyes@avity.com
www.avity.com

Re: Problems compiling Nutch in Eclipse

Posted by Doğacan Güney <do...@gmail.com>.
The RTF parser is not built by default because the jars it uses have some
licensing issues. It is also out of sync with the current trunk, so it
does not even build anymore.

This issue may help:
https://issues.apache.org/jira/browse/NUTCH-644
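
To illustrate what "out of sync" means for the two errors reported, here is a small self-contained Java sketch. It assumes, as the error messages suggest, that the trunk Parser.getParse(Content) now returns a ParseResult while the plugin still returns a Parse. The nested types below are simplified, hypothetical stand-ins that only borrow the names from the error messages; they are not the real Nutch classes, and the real fix for the plugin may look different.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: simplified stand-ins, not the real Nutch types.
class ReturnTypeMismatchSketch {
    static class Content {
        final String url;
        final String raw;
        Content(String url, String raw) { this.url = url; this.raw = raw; }
    }

    static class Parse {
        final String text;
        Parse(String text) { this.text = text; }
    }

    // Stand-in container: maps a URL to the Parse produced for it.
    static class ParseResult {
        private final Map<String, Parse> parses = new LinkedHashMap<>();
        void put(String url, Parse parse) { parses.put(url, parse); }
        Parse get(String url) { return parses.get(url); }
    }

    // The contract the plugin has to satisfy (assumed to return ParseResult).
    interface Parser {
        ParseResult getParse(Content content);
    }

    // A parser written against the older contract would be rejected with
    // "The return type is incompatible with Parser.getParse(Content)":
    //
    //   static class OldStyleParser implements Parser {
    //       public Parse getParse(Content content) { ... }
    //   }

    // Updated parser: wraps its Parse in a ParseResult to match the interface.
    static class UpdatedParser implements Parser {
        @Override
        public ParseResult getParse(Content content) {
            Parse parse = new Parse(content.raw.trim());
            ParseResult result = new ParseResult();
            result.put(content.url, parse);
            return result;
        }
    }

    public static void main(String[] args) {
        Parser parser = new UpdatedParser();
        ParseResult result = parser.getParse(new Content("http://example.com/a.rtf", "  hello  "));
        // Parse p = result;  // would fail: cannot convert from ParseResult to Parse
        Parse p = result.get("http://example.com/a.rtf");
        System.out.println(p.text); // prints hello
    }
}
```

The two commented-out lines correspond to the two Eclipse errors: one in the parser implementation, one in the test that consumes its result.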




-- 
Doğacan Güney