Posted to user@nutch.apache.org by Chaz Hickman <ch...@hp.com> on 2008/01/14 19:23:38 UTC

Problems building the parse-rtf plugin

I'm relatively new to using Nutch, but I've managed to successfully 
deploy it and use it so far. I've been asked to add rtf parsing to it, 
and I'm having problems.

As far as I can tell, I need to get hold of a file called 
rtf_parser_src.jar, but I can find nowhere to get it from. The source 
referenced in the build file and the README, 
http://www.cobase.cs.ucla.edu/pub/javacc/rtf_parser_src.jar, no longer 
exists, and so I'm left unable to successfully compile.

Can anyone point out where I'm going wrong, or direct me to where I can
download the required file?

Thanks,
Chaz

Re: Problems building the parse-rtf plugin

Posted by Chaz Hickman <ch...@hp.com>.
Shi Wang wrote:
> 
> Hi! Hickman,
>  
> You can download it here:
> http://nutch.cvs.sourceforge.net/nutch/nutch/src/plugin/parse-rtf/lib/
>  
> You will have this problem if you use version 0.9. The other plugin
> library you may be missing is the one for the mp3 parser, which you can
> download here:
> http://nutch.cvs.sourceforge.net/nutch/nutch/src/plugin/parse-mp3/lib/
>  
> For more information, you can see the Nutch wiki:
> http://wiki.apache.org/nutch/RunNutchInEclipse0.9
>  
> 2008/1/15, Chaz Hickman <chaz.hickman@hp.com>:
> 
>     I'm relatively new to using Nutch, but I've managed to successfully
>     deploy it and use it so far. I've been asked to add rtf parsing to it,
>     and I'm having problems.
> 
>     As far as I can tell, I need to get hold of a file called
>     rtf_parser_src.jar, but I can find nowhere to get it from. The source
>     referenced in the build file and the README,
>     http://www.cobase.cs.ucla.edu/pub/javacc/rtf_parser_src.jar, no longer
>     exists, and so I'm left unable to successfully compile.
> 
>     Can anyone point out where I'm going wrong, or direct me to where I can
>     download the required file?

Shawn,

Thanks for the pointer. I'd looked through the wiki, but missed the link
on that page. I've downloaded the file and managed to get the rtf plugin
built, although it wasn't 100% straightforward: the plugin's build.xml
insists on trying to download the parser source from the stale URL I
mentioned. Putting a dummy file in the tmp directory fixed that and
allowed the plugin to build.
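
For anyone repeating the dummy-file workaround, here is a minimal sketch in Python. It assumes the build only needs a file with the expected name to already exist in its tmp directory. The helper name place_dummy_file is made up for illustration, and both the directory and the file name you pass in are assumptions: check the plugin's build.xml for the real values.

```python
from pathlib import Path


def place_dummy_file(tmp_dir, file_name):
    """Create an empty placeholder file in tmp_dir.

    Hypothetical helper: tmp_dir and file_name are whatever the plugin's
    build.xml expects; neither value is taken from the Nutch sources.
    An existing file is left untouched, so a real download is never
    overwritten by the placeholder.
    """
    target = Path(tmp_dir) / file_name
    # Create the tmp directory (and any parents) if it is not there yet.
    target.parent.mkdir(parents=True, exist_ok=True)
    if not target.exists():
        target.touch()  # zero-byte placeholder
    return target


# Example call (both arguments are assumptions, adjust to your checkout):
# place_dummy_file("path/to/parse-rtf/tmp", "rtf_parser_src.jar")
```

The placeholder only stops the build from failing on the dead download. The real parser jar from the SourceForge lib directory still has to be in place for the plugin to work.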


Re: Problems building the parse-rtf plugin

Posted by Shi Wang <wa...@gmail.com>.
Hi! Hickman,

You can download it here:
http://nutch.cvs.sourceforge.net/nutch/nutch/src/plugin/parse-rtf/lib/

You will have this problem if you use version 0.9. The other plugin
library you may be missing is the one for the mp3 parser, which you can
download here:
http://nutch.cvs.sourceforge.net/nutch/nutch/src/plugin/parse-mp3/lib/

For more information, you can see the Nutch wiki:
http://wiki.apache.org/nutch/RunNutchInEclipse0.9


Shawn Wang

2008/1/15, Chaz Hickman <ch...@hp.com>:
>
> I'm relatively new to using Nutch, but I've managed to successfully
> deploy it and use it so far. I've been asked to add rtf parsing to it,
> and I'm having problems.
>
> As far as I can tell, I need to get hold of a file called
> rtf_parser_src.jar, but I can find nowhere to get it from. The source
> referenced in the build file and the README,
> http://www.cobase.cs.ucla.edu/pub/javacc/rtf_parser_src.jar, no longer
> exists, and so I'm left unable to successfully compile.
>
> Can anyone point out where I'm going wrong, or direct me to where I can
> download the required file?
>
> Thanks,
> Chaz
>


