Posted to user@nutch.apache.org by devang pandey <de...@gmail.com> on 2013/07/08 11:40:45 UTC

nutch 1.2 solr 3.6 integration issue

I have crawled a site successfully using Nutch 1.2. Now I want to integrate
it with Solr 3.6. The problem is that when I issue the command

$ bin/nutch solrindex http://localhost:8080/solr/ crawl/crawldb crawl/linkdb crawl/segments/*

an error occurs:

SolrIndexer: starting at 2013-07-08 14:52:27
java.io.IOException: Job failed!

Please help me solve this issue.


Thank you
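
A reply further down this thread suggests that crawl/segments/* may be the problem and that the segments should be listed explicitly or passed with -dir. As a rough sketch of the explicit form, the helper below prints the solrindex command with every segment directory spelled out, so the exact arguments can be inspected before running it. The function name build_solrindex_cmd and the variables SOLR_URL and CRAWL_DIR are made up for illustration; the -dir form is taken from the reply and has not been checked against the Nutch 1.2 usage output.

```shell
# Illustrative sketch only: names below are assumptions, not part of Nutch.
SOLR_URL="http://localhost:8080/solr/"
CRAWL_DIR="crawl"

build_solrindex_cmd() {
  # $1 = Solr URL, $2 = crawl directory containing crawldb, linkdb and segments
  cmd="bin/nutch solrindex $1 $2/crawldb $2/linkdb"
  for seg in "$2"/segments/*/; do
    # Skip the unexpanded pattern when there are no segment directories
    [ -d "$seg" ] || continue
    cmd="$cmd ${seg%/}"
  done
  printf '%s\n' "$cmd"
}

# Print the command instead of running it, so the segment list can be reviewed
build_solrindex_cmd "$SOLR_URL" "$CRAWL_DIR"

# Alternative form mentioned in the thread (verify against the usage message first):
# bin/nutch solrindex "$SOLR_URL" crawl/crawldb crawl/linkdb -dir crawl/segments
```

Printing rather than executing keeps the sketch safe to run anywhere; the printed line can be pasted into the shell once it looks right.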

Re: nutch 1.2 solr 3.6 integration issue

Posted by devang pandey <de...@gmail.com>.
Can I downgrade my Solr to 3.1? Will this work?


On Mon, Jul 8, 2013 at 4:16 PM, Markus Jelsma <ma...@openindex.io> wrote:

> Hmm, if dropping in the jars doesn't work (i.e. it yields the same error in
> the logs), I don't know what you could do except upgrade to a more recent
> Nutch. 1.5 or higher should work.

Re: nutch 1.2 solr 3.6 integration issue

Posted by devang pandey <de...@gmail.com>.
I am very sorry for the wrong reply. I am using the binary distribution.


On Mon, Jul 8, 2013 at 4:10 PM, Markus Jelsma <ma...@openindex.io> wrote:

> Are you building Nutch from source?

Re: nutch 1.2 solr 3.6 integration issue

Posted by devang pandey <de...@gmail.com>.
Markus, I copied the SolrJ client jars from solr/dist to the Nutch lib
directory and removed the older ones. I also ran an ant job, but it is still
not working.
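
For readers following along, the jar swap described here might look roughly like the sketch below. Everything in it is an assumption made for illustration: the install paths, the lib-backup directory, the function name swap_solrj_jars, and the *solrj*.jar naming pattern. It moves the old SolrJ jars aside instead of deleting them, and it does not touch any other libraries that SolrJ itself depends on, which may also need to match.

```shell
# Hypothetical install locations; adjust to the real paths.
NUTCH_HOME="${NUTCH_HOME:-/opt/nutch-1.2}"
SOLR_HOME="${SOLR_HOME:-/opt/apache-solr-3.6.0}"

swap_solrj_jars() {
  # $1 = Nutch home, $2 = Solr home
  nutch_home="$1"
  solr_home="$2"
  backup_dir="$nutch_home/lib-backup"
  mkdir -p "$backup_dir"

  # Move the existing SolrJ jars out of lib/ (kept as a backup)
  for jar in "$nutch_home"/lib/*solrj*.jar; do
    if [ -f "$jar" ]; then mv "$jar" "$backup_dir/"; fi
  done

  # Copy the SolrJ jars shipped with the Solr distribution
  for jar in "$solr_home"/dist/*solrj*.jar; do
    if [ -f "$jar" ]; then cp "$jar" "$nutch_home/lib/"; fi
  done

  # Show which SolrJ jars are now in lib/
  ls "$nutch_home/lib" | grep -i solrj
}

# Usage (not run automatically):
# swap_solrj_jars "$NUTCH_HOME" "$SOLR_HOME"
```

Keeping the old jars in a backup directory makes it easy to revert if the newer client turns out not to work with this Nutch version.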


On Mon, Jul 8, 2013 at 3:54 PM, Markus Jelsma <ma...@openindex.io> wrote:

> Well, the API hasn't changed, I think, so you might consider upgrading the
> SolrJ client in Nutch to 3.6. Check ivy/ivy.xml in your Nutch source, or
> copy the proper SolrJ client jars to the lib/ directory if you only
> have a binary distribution.

Re: nutch 1.2 solr 3.6 integration issue

Posted by devang pandey <de...@gmail.com>.
Thanks, Markus, for the valuable insight. For now I need to keep working
with this version only, so could you please suggest whether anything can be
done to support this version of Nutch?


On Mon, Jul 8, 2013 at 3:46 PM, Markus Jelsma <ma...@openindex.io> wrote:

> Ah, since you're using an old Nutch with an old SolrJ client, and the
> Javabin format has changed over time, I think your Solr is too new for the
> client. I'd advise upgrading to Nutch 1.8 if you can.

RE: nutch 1.2 solr 3.6 integration issue

Posted by Markus Jelsma <ma...@openindex.io>.
Ah, since you're using an old Nutch with an old SolrJ client, and the Javabin format has changed over time, I think your Solr is too new for the client. I'd advise upgrading to Nutch 1.8 if you can.

 
 
-----Original message-----
> From:devang pandey <de...@gmail.com>
> Sent: Monday 8th July 2013 12:13
> To: user@nutch.apache.org
> Subject: Re: nutch 1.2 solr 3.6 integration issue
> 
> Hey, this is my Hadoop log file:
>
> java.lang.RuntimeException: Invalid version (expected 2, but 60) or the data in not in 'javabin' format
>     at org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:99)
>     at org.apache.solr.client.solrj.impl.BinaryResponseParser.processResponse(BinaryResponseParser.java:41)
>     at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:469)
>     at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:249)
>     at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
>     at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:69)
>     at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:54)
>     at org.apache.nutch.indexer.solr.SolrWriter.close(SolrWriter.java:75)
>     at org.apache.nutch.indexer.IndexerOutputFormat$1.close(IndexerOutputFormat.java:48)
>     at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:474)
>     at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:411)
>     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:216)
> 2013-07-08 15:17:39,539 ERROR solr.SolrIndexer - java.io.IOException: Job failed!
> 
> 
> 
> On Mon, Jul 8, 2013 at 3:41 PM, Markus Jelsma <ma...@openindex.io> wrote:
>
> > Hi,
> >
> > you still haven't provided the logs.
> >
> >
> > -----Original message-----
> > > From:devang pandey <de...@gmail.com>
> > > Sent: Monday 8th July 2013 12:08
> > > To: user@nutch.apache.org
> > > Subject: Re: nutch 1.2 solr 3.6 integration issue
> > >
> > > Hey Markus, I tried your solution but it's still not working. Can you
> > > please suggest some other way of resolving this issue?
> > >
> > >
> > > On Mon, Jul 8, 2013 at 3:14 PM, Markus Jelsma <markus.jelsma@openindex.io> wrote:
> > >
> > > > You need to provide the log output. But I think crawl/segments/* is the
> > > > problem. You must either list the segments explicitly (seg1 seg2 seg3)
> > > > or use -dir segments/. Wildcards are not supported!
> > > >
> > > > Cheers

RE: nutch 1.2 solr 3.6 integration issue

Posted by Markus Jelsma <ma...@openindex.io>.
Hmm, if dropping in the jars doesn't work (i.e. it yields the same error in the logs), I don't know what you could do except upgrade to a more recent Nutch. 1.5 or higher should work.
 
 
-----Original message-----
> From:devang pandey <de...@gmail.com>
> Sent: Monday 8th July 2013 12:45
> To: user@nutch.apache.org
> Subject: Re: nutch 1.2 solr 3.6 integration issue
> 
> I am very sorry for the wrong reply. I am using the binary distribution.

Re: nutch 1.2 solr 3.6 integration issue

Posted by devang pandey <de...@gmail.com>.
Hey, this is my Hadoop log file:
java.lang.RuntimeException: Invalid version (expected 2, but 60) or the data in not in 'javabin' format
at org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:99)
at org.apache.solr.client.solrj.impl.BinaryResponseParser.processResponse(BinaryResponseParser.java:41)
at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:469)
at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:249)
at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:69)
at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:54)
at org.apache.nutch.indexer.solr.SolrWriter.close(SolrWriter.java:75)
at org.apache.nutch.indexer.IndexerOutputFormat$1.close(IndexerOutputFormat.java:48)
at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:474)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:411)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:216)
2013-07-08 15:17:39,539 ERROR solr.SolrIndexer - java.io.IOException: Job failed!
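
A note on reading the first line of that trace, offered as an interpretation and not as something stated in the thread: the "60" appears to be the first byte the SolrJ response parser received, and 60 is the ASCII code for "<". That would be consistent with the server having answered with XML or an HTML page instead of a javabin payload. The arithmetic can be checked from a shell; byte_to_char is a hypothetical helper name.

```shell
# Hypothetical helper: print the ASCII character for a decimal byte value.
byte_to_char() {
  # Convert the decimal value to a three-digit octal escape,
  # then let printf emit the corresponding character.
  printf "\\$(printf '%03o' "$1")\n"
}

byte_to_char 60   # the "but 60" value from the exception message; prints <
```

If that reading is right, looking at what the Solr URL actually returns (for example in the Solr server log) may say more than the client-side trace does.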




Re: nutch 1.2 solr 3.6 integration issue

Posted by devang pandey <de...@gmail.com>.
Hey Markus. I tried your solution, but it's still not working. Can you please
suggest some other way of resolving this issue?



RE: nutch 1.2 solr 3.6 integration issue

Posted by Markus Jelsma <ma...@openindex.io>.
You need to provide the log output. But I think crawl/segments/* is the problem. You must either list the segments explicitly (seg1 seg2 seg3) or use -dir segments/. Wildcards are not supported!
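
The two invocation forms described above can be sketched as follows. This assumes the -dir option behaves as described in this reply for the Nutch version in use; the segment names are made up for illustration, and solrindex_cmd_dir is a hypothetical helper that only prints the command so it can be inspected before running it.

```shell
# Form 1: name each segment explicitly (hypothetical segment names):
#   bin/nutch solrindex http://localhost:8080/solr/ crawl/crawldb crawl/linkdb \
#       crawl/segments/20130708120000 crawl/segments/20130708130000

# Form 2: pass the parent directory with -dir, as suggested in the reply.
# Hypothetical helper: print (do not run) the -dir form for a crawl directory.
solrindex_cmd_dir() {
  solr_url=$1    # Solr base URL
  crawl_dir=$2   # directory holding crawldb, linkdb and segments
  printf 'bin/nutch solrindex %s %s/crawldb %s/linkdb -dir %s/segments\n' \
    "$solr_url" "$crawl_dir" "$crawl_dir" "$crawl_dir"
}

solrindex_cmd_dir http://localhost:8080/solr/ crawl
```

Printing the command first makes it easy to confirm that no path has been split across a line break before the job is started.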

Cheers

 
 

RE: nutch 1.2 solr 3.6 integration issue

Posted by Markus Jelsma <ma...@openindex.io>.
Well, the API hasn't changed, I think, so you might consider upgrading the SolrJ client in Nutch to 3.6. Check ivy/ivy.xml in your Nutch source, or copy the proper SolrJ client jars to the lib/ directory if you only have a binary distribution.
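
For the binary-distribution case, the jar swap could look roughly like the sketch below. It is only an illustration: swap_solrj_jars is a hypothetical helper, and the *solrj*.jar file name pattern is an assumption about how the client jars are named in both the Solr download and the Nutch lib/ directory. Old jars are moved to a backup subdirectory instead of being deleted.

```shell
# Hypothetical helper: replace the SolrJ client jars in a Nutch lib/ directory.
# Assumption: the client jars match *solrj*.jar in both directories.
swap_solrj_jars() {
  solr_dist=$1   # e.g. the dist/ directory of the Solr 3.6 download
  nutch_lib=$2   # e.g. the lib/ directory of the Nutch binary distribution
  backup="$nutch_lib/solrj-backup"

  mkdir -p "$backup" || return 1
  # Move the old client jars aside so they can be restored later.
  for jar in "$nutch_lib"/*solrj*.jar; do
    [ -e "$jar" ] && mv "$jar" "$backup"/
  done
  # Copy the newer client jars into place.
  for jar in "$solr_dist"/*solrj*.jar; do
    [ -e "$jar" ] && cp "$jar" "$nutch_lib"/
  done
  return 0
}

# Usage (paths are placeholders):
#   swap_solrj_jars /path/to/solr/dist /path/to/nutch/lib
```

SolrJ's own dependencies may also need to match the newer client; that is not covered here.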
 
-----Original message-----
> From:devang pandey <de...@gmail.com>
> Sent: Monday 8th July 2013 12:20
> To: user@nutch.apache.org
> Subject: Re: nutch 1.2 solr 3.6 integration issue
> 
> Thanks, Markus, for the valuable insight. Currently I need to continue
> working on this only, so could you please suggest whether anything can be done
> to support this version of Nutch?
> 
> 
> On Mon, Jul 8, 2013 at 3:46 PM, Markus Jelsma <ma...@openindex.io>wrote:
> 
> > Ah, since you're using an old Nutch and an old SolrJ client, and the
> > Javabin format has changed over time, I think your Solr is too new for the
> > client. I'd advise upgrading to Nutch 1.8 if you can.
> >
> >
> >

RE: nutch 1.2 solr 3.6 integration issue

Posted by Markus Jelsma <ma...@openindex.io>.
Hi,

You still haven't provided the logs.

 

RE: nutch 1.2 solr 3.6 integration issue

Posted by Markus Jelsma <ma...@openindex.io>.
Are you building Nutch from source?
 
-----Original message-----
> From:devang pandey <de...@gmail.com>
> Sent: Monday 8th July 2013 12:36
> To: user@nutch.apache.org
> Subject: Re: nutch 1.2 solr 3.6 integration issue
> 
> Markus, I copied the SolrJ client jars from solr/dist to the Nutch lib directory and removed
> the older ones. I also ran an Ant job, but it is still not
> working.
> 
> 
> On Mon, Jul 8, 2013 at 3:54 PM, Markus Jelsma <ma...@openindex.io>wrote:
> 
> > Well, the API hasn't changed i think so you might consider upgrading the
> > solrJ client in Nutch to 3.6. Check the ivy/ivy.xml in your Nutch source or
> > copy the proper SolrJ client jars to the lib/ directory in case you only
> > have a binary distribution.
> >

RE: nutch 1.2 solr 3.6 integration issue

Posted by Markus Jelsma <ma...@openindex.io>.
I am no longer sure when the javabin version changed. You can try, but it's far from ideal.
 
-----Original message-----
> From:devang pandey <de...@gmail.com>
> Sent: Monday 8th July 2013 12:49
> To: user@nutch.apache.org
> Subject: Re: nutch 1.2 solr 3.6 integration issue
> 
> Can I downgrade my Solr to 3.1? Will this work?
> 
> 
> On Mon, Jul 8, 2013 at 4:16 PM, Markus Jelsma <ma...@openindex.io>wrote:
> 
> > Hmm, if dropping in the jars doesn't work (yields the same error in the logs), I
> > don't know what you could do except upgrade to a more recent Nutch. 1.5
> > or higher should work.
> >
> >
> > -----Original message-----
> > > From:devang pandey <de...@gmail.com>
> > > Sent: Monday 8th July 2013 12:45
> > > To: user@nutch.apache.org
> > > Subject: Re: nutch 1.2 solr 3.6 integration issue
> > >
> > > I am very sorry for the wrong reply. I am using the binary distribution.
> > >
> > >
> > > On Mon, Jul 8, 2013 at 4:10 PM, Markus Jelsma <markus.jelsma@openindex.io>wrote:
> > >
> > > > Are you building Nutch from source?
> > > >