Posted to user@nutch.apache.org by Chip Calhoun <cc...@aip.org> on 2011/06/20 16:44:13 UTC
Questions about upgrade to Nutch 1.3
Hi everyone,
I'm a complete Nutch newbie. I installed Nutch 1.2 and Solr 1.4.0 on my machine without any trouble. I've decided to try Nutch 1.3 as it's compatible with Solr 3.1.0, which includes Solritas. I hope you can help with some problems I'm having.
The Nutch documentation still describes a lot of operations happening from $NUTCH_HOME/, but they all apparently need to happen from $NUTCH_HOME/runtime/deploy or $NUTCH_HOME/runtime/local. Which of these folders should I actually be using?
Has NutchBean been deprecated? If so, how can I run a search and make sure my crawl worked? I get no results when I try to search using Solr, so I'd like to figure out whether the problem is with my Nutch itself or with my attempt at integrating with Solr.
I get an error saying "solrurl is not set". This seems to be new to Nutch 1.3. Where do I set this?
If you can answer any of these, I'd appreciate it. Thanks!
Chip
Re: Questions about upgrade to Nutch 1.3
Posted by Markus Jelsma <ma...@openindex.io>.
You can safely use 1.3 with Solr 3.1 and Velocity. I've got the stuff up and
running as well.
On Tuesday 21 June 2011 15:45:53 Chip Calhoun wrote:
> Ahh, thanks again. Based on your advice, I'm going back to Nutch 1.2 /
> Solr 1.4 and adding the Velocity contrib. Once I get that working, I'll
> try with Nutch 1.3 again.
>
> When I try to use Velocity now, I get this message:
> java.lang.RuntimeException: Can't find resource 'velocity.properties' in
> classpath or 'solr/conf/', cwd=C:\apache\apache-solr-1.4.0\example
> This is despite velocity.properties very definitely being in my
> C:\apache\apache-solr-1.4.0\example\solr\conf directory. But I've veered
> completely into Solr territory now, so I guess that's off-topic.
The properties file is not in 3.1. I don't know about 1.4, but I don't think
it is there either.
>
> >>> Markus Jelsma <ma...@openindex.io> 6/20/2011 12:43 PM >>>
>
> On Monday 20 June 2011 18:35:36 Chip Calhoun wrote:
> > Thanks for replying! I do still have a couple of questions:
> > > Markus Jelsma <ma...@openindex.io> 6/20/2011 11:34 AM >>>
> > >
> > > > On Monday 20 June 2011 16:44:13 Chip Calhoun wrote:
> > > > Hi everyone,
> > > >
> > > > I'm a complete Nutch newbie. I installed Nutch 1.2 and Solr 1.4.0 on
> > > > my machine without any trouble. I've decided to try Nutch 1.3 as
> > > > it's compatible with Solr 3.1.0, which includes Solritas. I hope
> > > > you can help with some problems I'm having.
> > >
> > > Solr 1.4.x has Velocity as a contrib.
> >
> > Does it? Under 1.4.0 I could never get http://localhost:8983/solr/browse
> > to work. I thought this was only added later.
>
> The libs must be added manually from contrib, but it is shipped.
>
> > > > I get an error saying "solrurl is not set". This seems to be new to
> > > > Nutch 1.3. Where do I set this?
> > >
> > > According to the source you're using the crawl command.
> > > Usage: Crawl <urlDir> -solr <solrURL> [-dir d] [-threads n] [-depth i]
> > > [-topN N]
> >
> > Thanks, I hadn't known about the solrURL argument at all. So would a
> > valid usage be:
> > bin/nutch crawl urls -solr http://127.0.0.1:8983 -dir solrcrawl -depth 10 -topN 50
> > With the new solrURL argument, are there any steps I need to do after
> > my crawl to get my content into Solr?
>
> I think so, but I don't use it. Please try.
>
> > Thanks!
--
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350
Re: Questions about upgrade to Nutch 1.3
Posted by Chip Calhoun <cc...@aip.org>.
Ahh, thanks again. Based on your advice, I'm going back to Nutch 1.2 / Solr 1.4 and adding the Velocity contrib. Once I get that working, I'll try with Nutch 1.3 again.
When I try to use Velocity now, I get this message:
java.lang.RuntimeException: Can't find resource 'velocity.properties' in classpath or 'solr/conf/', cwd=C:\apache\apache-solr-1.4.0\example
This is despite velocity.properties very definitely being in my C:\apache\apache-solr-1.4.0\example\solr\conf directory. But I've veered completely into Solr territory now, so I guess that's off-topic.
Re: Questions about upgrade to Nutch 1.3
Posted by Markus Jelsma <ma...@openindex.io>.
On Monday 20 June 2011 18:35:36 Chip Calhoun wrote:
> Thanks for replying! I do still have a couple of questions:
> > Markus Jelsma <ma...@openindex.io> 6/20/2011 11:34 AM >>>
> >
> > > On Monday 20 June 2011 16:44:13 Chip Calhoun wrote:
> > > Hi everyone,
> > >
> > > I'm a complete Nutch newbie. I installed Nutch 1.2 and Solr 1.4.0 on
> > > my machine without any trouble. I've decided to try Nutch 1.3 as it's
> > > compatible with Solr 3.1.0, which includes Solritas. I hope you can
> > > help with some problems I'm having.
> >
> > Solr 1.4.x has Velocity as a contrib.
>
> Does it? Under 1.4.0 I could never get http://localhost:8983/solr/browse
> to work. I thought this was only added later.
The libs must be added manually from contrib, but it is shipped.
>
> > > I get an error saying "solrurl is not set". This seems to be new to
> > > Nutch 1.3. Where do I set this?
> >
> > According to the source you're using the crawl command.
> > Usage: Crawl <urlDir> -solr <solrURL> [-dir d] [-threads n] [-depth i]
> > [-topN N]
>
> Thanks, I hadn't known about the solrURL argument at all. So would a valid
> usage be:
> bin/nutch crawl urls -solr http://127.0.0.1:8983 -dir solrcrawl -depth 10 -topN 50
> With the new solrURL argument, are there any steps I need to do after my
> crawl to get my content into Solr?
I think so, but I don't use it. Please try.
>
> Thanks!
--
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350
Re: Questions about upgrade to Nutch 1.3
Posted by Chip Calhoun <cc...@aip.org>.
Thanks for replying! I do still have a couple of questions:
> Markus Jelsma <ma...@openindex.io> 6/20/2011 11:34 AM >>>
> > On Monday 20 June 2011 16:44:13 Chip Calhoun wrote:
> > Hi everyone,
> >
> > I'm a complete Nutch newbie. I installed Nutch 1.2 and Solr 1.4.0 on my
> > machine without any trouble. I've decided to try Nutch 1.3 as it's
> > compatible with Solr 3.1.0, which includes Solritas. I hope you can help
> > with some problems I'm having.
>
> Solr 1.4.x has Velocity as a contrib.
Does it? Under 1.4.0 I could never get http://localhost:8983/solr/browse to work. I thought this was only added later.
> > I get an error saying "solrurl is not set". This seems to be new to Nutch
> > 1.3. Where do I set this?
>
> According to the source you're using the crawl command.
> Usage: Crawl <urlDir> -solr <solrURL> [-dir d] [-threads n] [-depth i] [-topN N]
Thanks, I hadn't known about the solrURL argument at all. So would a valid usage be:
bin/nutch crawl urls -solr http://127.0.0.1:8983 -dir solrcrawl -depth 10 -topN 50
With the new solrURL argument, are there any steps I need to do after my crawl to get my content into Solr?
Thanks!
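
For readers following along, here is a minimal sketch of how one might check whether the crawl reached Solr, and how to index by hand if it did not. It assumes the stock Solr example server, whose URL normally ends in /solr (the command above omits that path), and it assumes the crawl wrote to the solrcrawl directory used in that command. The solrindex arguments follow the usage printed by Nutch 1.3 as far as I know; treat the exact form as an assumption and compare it with the usage message from your own build. The script only prints the commands so that nothing runs by accident.

```shell
# Assumption: Solr example server on the default port; note the /solr path.
SOLR_URL="http://127.0.0.1:8983/solr"
# Matches the -dir value used in the crawl command quoted above.
CRAWL_DIR="solrcrawl"

# A match-all query that returns no rows; the numFound value in the
# response shows how many documents are in the index.
CHECK_URL="$SOLR_URL/select?q=*:*&rows=0"

# Manual indexing step, in case the crawl did not push anything to Solr.
# The * stays literal here because the string is quoted.
INDEX_CMD="bin/nutch solrindex $SOLR_URL $CRAWL_DIR/crawldb $CRAWL_DIR/linkdb $CRAWL_DIR/segments/*"

# Print the commands for review instead of executing them.
echo "curl '$CHECK_URL'"
echo "$INDEX_CMD"
```

If numFound is 0 after the crawl, running the printed solrindex command from the Nutch directory is the next thing to try.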
Re: Questions about upgrade to Nutch 1.3
Posted by Markus Jelsma <ma...@openindex.io>.
On Monday 20 June 2011 16:44:13 Chip Calhoun wrote:
> Hi everyone,
>
> I'm a complete Nutch newbie. I installed Nutch 1.2 and Solr 1.4.0 on my
> machine without any trouble. I've decided to try Nutch 1.3 as it's
> compatible with Solr 3.1.0, which includes Solritas. I hope you can help
> with some problems I'm having.
Solr 1.4.x has Velocity as a contrib.
>
> The Nutch documentation still describes a lot of operations happening from
> $NUTCH_HOME/, but they all apparently need to happen from
> $NUTCH_HOME/runtime/deploy or $NUTCH_HOME/runtime/local. Which of these
> folders should I actually be using?
Use local if you're not running on Hadoop. You can then consider runtime/local
as your NUTCH_HOME.
>
> Has NutchBean been deprecated? If so, how can I run a search and make sure
> my crawl worked? I get no results when I try to search using Solr, so I'd
> like to figure out whether the problem is with my Nutch itself or with my
> attempt at integrating with Solr.
NutchBean was for Nutch's own search, which is gone in 1.3. Searching is now done through Solr.
>
> I get an error saying "solrurl is not set". This seems to be new to Nutch
> 1.3. Where do I set this?
According to the source you're using the crawl command.
Usage: Crawl <urlDir> -solr <solrURL> [-dir d] [-threads n] [-depth i] [-topN N]
>
> If you can answer any of these, I'd appreciate it. Thanks!
>
> Chip
--
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350
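
Putting the answers above together, a minimal sketch of a local (non-Hadoop) run might look like the following. The install path, the Solr URL, the crawl directory name and the depth/topN values are all assumptions chosen for illustration; only the argument order comes from the Crawl usage line quoted above. The readdb -stats call is a way to confirm that the crawl fetched something without involving Solr at all. The script prints the commands rather than running them.

```shell
# Assumption: hypothetical install location; runtime/local acts as NUTCH_HOME
# when not running on Hadoop.
NUTCH_HOME="/opt/apache-nutch-1.3/runtime/local"
# Assumption: Solr example server on its default port and path.
SOLR_URL="http://localhost:8983/solr/"

# Crawl and pass the Solr URL, following the usage line:
#   Crawl <urlDir> -solr <solrURL> [-dir d] [-threads n] [-depth i] [-topN N]
CRAWL_CMD="bin/nutch crawl urls -solr $SOLR_URL -dir crawl -depth 3 -topN 50"

# Inspect the crawl database directly to see whether URLs were fetched.
STATS_CMD="bin/nutch readdb crawl/crawldb -stats"

# Print the commands for review instead of executing them.
echo "cd $NUTCH_HOME"
echo "$CRAWL_CMD"
echo "$STATS_CMD"
```

If the stats show fetched pages but Solr still returns nothing, the problem is more likely on the indexing side than in the crawl itself.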