Posted to solr-user@lucene.apache.org by Manikandan Saravanan <ma...@thesocialpeople.net> on 2014/05/26 20:20:11 UTC
Solr Deduplicate - Class Not Found Exception
Hi,
I’m running Nutch 2 on a Hadoop 1.2.1 cluster with 2 nodes. I’m running Solr 4 separately on a box and I replaced Solr’s schema with Nutch’s Solr-4 schema. When I run a crawl, I get the following error at the end of the job:
14/05/26 14:08:32 INFO solr.SolrDeleteDuplicates: SolrDeleteDuplicates: starting...
14/05/26 14:08:32 INFO solr.SolrDeleteDuplicates: SolrDeleteDuplicates: Solr url: http://10.130.231.16:8983/solr/nutch
14/05/26 14:08:33 WARN mapred.JobClient: No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
14/05/26 14:08:33 INFO mapred.JobClient: Running job: job_201405261214_0014
14/05/26 14:08:34 INFO mapred.JobClient: map 0% reduce 0%
14/05/26 14:08:43 INFO mapred.JobClient: Task Id : attempt_201405261214_0014_m_000000_0, Status : FAILED
java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:857)
at org.apache.hadoop.mapreduce.JobContext.getInputFormatClass(JobContext.java:187)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:722)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.ClassNotFoundException: org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:270)
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:810)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:855)
... 8 more
14/05/26 14:08:43 WARN mapred.JobClient: Error reading task outputnutch-two-qontifi
14/05/26 14:08:43 WARN mapred.JobClient: Error reading task outputnutch-two-qontifi
14/05/26 14:08:44 INFO mapred.JobClient: Task Id : attempt_201405261214_0014_m_000001_0, Status : FAILED
java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:857)
at org.apache.hadoop.mapreduce.JobContext.getInputFormatClass(JobContext.java:187)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:722)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.ClassNotFoundException: org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:270)
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:810)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:855)
... 8 more
14/05/26 14:08:50 INFO mapred.JobClient: Task Id : attempt_201405261214_0014_m_000001_1, Status : FAILED
java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:857)
at org.apache.hadoop.mapreduce.JobContext.getInputFormatClass(JobContext.java:187)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:722)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.ClassNotFoundException: org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:270)
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:810)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:855)
... 8 more
14/05/26 14:08:50 WARN mapred.JobClient: Error reading task outputnutch-two-qontifi
14/05/26 14:08:50 WARN mapred.JobClient: Error reading task outputnutch-two-qontifi
14/05/26 14:08:51 INFO mapred.JobClient: Task Id : attempt_201405261214_0014_m_000000_1, Status : FAILED
java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:857)
at org.apache.hadoop.mapreduce.JobContext.getInputFormatClass(JobContext.java:187)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:722)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.ClassNotFoundException: org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:270)
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:810)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:855)
... 8 more
14/05/26 14:08:56 INFO mapred.JobClient: Task Id : attempt_201405261214_0014_m_000001_2, Status : FAILED
java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:857)
at org.apache.hadoop.mapreduce.JobContext.getInputFormatClass(JobContext.java:187)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:722)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.ClassNotFoundException: org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:270)
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:810)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:855)
... 8 more
14/05/26 14:08:56 WARN mapred.JobClient: Error reading task outputnutch-two-qontifi
14/05/26 14:08:56 WARN mapred.JobClient: Error reading task outputnutch-two-qontifi
14/05/26 14:08:57 INFO mapred.JobClient: Task Id : attempt_201405261214_0014_m_000000_2, Status : FAILED
java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:857)
at org.apache.hadoop.mapreduce.JobContext.getInputFormatClass(JobContext.java:187)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:722)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.ClassNotFoundException: org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:270)
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:810)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:855)
And finally, nothing is being indexed to Solr. Am I missing anything? Please help.
Regards,
Manikandan
Re: Solr Deduplicate - Class Not Found Exception
Posted by Julien Nioche <li...@gmail.com>.
Hi Manikandan
I presume you are running it from runtime/deploy/bin. Have you tried
looking into the job jar file in runtime/deploy to make sure the class is
in the archive?
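A minimal sketch of that check in Java, assuming the .job file is an ordinary zip archive and the Nutch classes are stored as plain entries in it (jars nested inside the archive, for example under lib/, are not opened). JobJarCheck is a hypothetical helper name, not part of Nutch or Hadoop, and the path you pass is whatever .job file actually sits in runtime/deploy.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.zip.ZipFile;

// Hypothetical helper for illustration; not part of Nutch or Hadoop.
final class JobJarCheck {

    // Returns true if the archive has a .class entry for the given binary class
    // name, either at the archive root or under some directory prefix.
    static boolean containsClass(Path archive, String binaryName) throws IOException {
        // Binary names use '.' between packages and '$' before a nested class.
        String entryName = binaryName.replace('.', '/') + ".class";
        try (ZipFile zip = new ZipFile(archive.toFile())) {
            return zip.stream().anyMatch(e ->
                    e.getName().equals(entryName) || e.getName().endsWith("/" + entryName));
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.out.println("usage: java JobJarCheck <path-to-.job-file> [binary.class.Name]");
            return;
        }
        // Default to the class named in the ClassNotFoundException above.
        String name = args.length > 1 ? args[1]
                : "org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat";
        boolean found = containsClass(Paths.get(args[0]), name);
        System.out.println((found ? "found: " : "missing: ") + name);
    }
}
```

If you pass the class name on the command line, put it in single quotes so the shell does not expand the `$`.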
Julien
--
Open Source Solutions for Text Engineering
http://digitalpebble.blogspot.com/
http://www.digitalpebble.com
http://twitter.com/digitalpebble
Re: Solr Deduplicate - Class Not Found Exception
Posted by Shawn Heisey <so...@elyograg.org>.
On 5/26/2014 12:20 PM, Manikandan Saravanan wrote:
> I’m running Nutch 2 on a Hadoop 1.2.1 cluster with 2 nodes. I’m running Solr 4 separately on a box and I replaced Solr’s schema with Nutch’s Solr-4 schema. When I run a crawl, I get the following error at the end of the job
>
> 14/05/26 14:08:32 INFO solr.SolrDeleteDuplicates: SolrDeleteDuplicates: starting...
> 14/05/26 14:08:32 INFO solr.SolrDeleteDuplicates: SolrDeleteDuplicates: Solr url: http://10.130.231.16:8983/solr/nutch
> 14/05/26 14:08:33 WARN mapred.JobClient: No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
> 14/05/26 14:08:33 INFO mapred.JobClient: Running job: job_201405261214_0014
> 14/05/26 14:08:34 INFO mapred.JobClient: map 0% reduce 0%
> 14/05/26 14:08:43 INFO mapred.JobClient: Task Id : attempt_201405261214_0014_m_000000_0, Status : FAILED
> java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat
> at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:857)
> at org.apache.hadoop.mapreduce.JobContext.getInputFormatClass(JobContext.java:187)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:722)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
> at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
> at org.apache.hadoop.mapred.Child.main(Child.java:249)
> Caused by: java.lang.ClassNotFoundException: org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat
I am not subscribed to the Nutch mailing list, so I have removed that
list from the recipients here.
If you look at the last line that I quoted above, you'll see that the
exception is caused by the inability of Java to locate a class, and that
the class is a Nutch class. I just built Nutch 2.2.1 on my server, and
the strange thing here is that this class seems to be part of the main
Apache Nutch jar, so I have no idea how you are using Nutch without this
class being present.
Because this is a Nutch class that is missing and not a Solr class, the
Solr mailing list can't really provide much help.
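To illustrate the failure itself: the trace shows Hadoop's Configuration.getClassByName calling Class.forName, which throws ClassNotFoundException when the named class is not visible to the class loader doing the lookup. The sketch below reproduces that lookup using only the JDK. ClassLookupDemo is a hypothetical name, and the nested JDK class is there only to show that nested classes are looked up by their binary name, with a `$`.

```java
// Hypothetical demo class; uses only the JDK.
final class ClassLookupDemo {

    // Tries to load a class by binary name without initializing it.
    // Returns false when the class loader cannot find it.
    static boolean isLoadable(String binaryName) {
        try {
            Class.forName(binaryName, false, ClassLookupDemo.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] names = {
            "java.util.Map$Entry",   // nested class, correct binary name
            "java.util.Map.Entry",   // source-style name, not a binary name
            // The class from the trace; found only if Nutch is on the classpath.
            "org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat"
        };
        for (String n : names) {
            System.out.println(n + " -> " + (isLoadable(n) ? "loadable" : "ClassNotFoundException"));
        }
    }
}
```

The result for the Nutch class depends entirely on what is on the classpath of the JVM running the lookup, which is why the contents of the job archive shipped to the task nodes matter.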
Thanks,
Shawn
Re: Solr Deduplicate - Class Not Found Exception
Posted by Lewis John Mcgibbney <le...@gmail.com>.
Hi,
On Mon, May 26, 2014 at 11:20 AM, Manikandan Saravanan <
manikandan@thesocialpeople.net> wrote:
>
> I’m running Nutch 2
>
Which version?
Do you have the code packaged into the .job jar?
You need to look in there and see. It seems that it is not there.