Posted to solr-user@lucene.apache.org by Antony Steiner <an...@gmail.com> on 2012/11/08 10:52:48 UTC

Apache Nutch 1.5.1 + Apache Solr 4.0

Hello, my name is Antony and I'm new to Apache Nutch and Solr.

I want to crawl my website and therefore I downloaded Nutch to do this.
This works fine. But now I would like to integrate Nutch with Solr. I'm
running this on my Unix system.
I'm trying to follow this tutorial:
http://wiki.apache.org/nutch/NutchTutorial
But it won't work for me. Running Solr without Nutch is no problem. I can post
documents to Solr with post.jar. But what I want to do is post my Nutch
crawl to Solr.
Now if I copy the schema.xml from Nutch to the
apache-solr-4.0.0/example/solr/collection1/conf directory and restart Solr
(java -jar start.jar), I get the errors below, but Solr will start. (Is this
the correct directory to copy my schema to?)

Nov 8, 2012 9:40:33 AM org.apache.solr.schema.IndexSchema readSchema
INFO: Schema name=nutch
Nov 8, 2012 9:40:33 AM org.apache.solr.core.CoreContainer create
SEVERE: Unable to create core: collection1
org.apache.solr.common.SolrException: Schema Parsing Failed: multiple points
        at
org.apache.solr.schema.IndexSchema.readSchema(IndexSchema.java:571)
        at org.apache.solr.schema.IndexSchema.<init>(IndexSchema.java:113)
...

Nov 8, 2012 9:40:33 AM org.apache.solr.common.SolrException log
SEVERE: null:org.apache.solr.common.SolrException: Schema Parsing Failed:
multiple points
        at
org.apache.solr.schema.IndexSchema.readSchema(IndexSchema.java:571)
        at org.apache.solr.schema.IndexSchema.<init>(IndexSchema.java:113)
        at org.apache.solr.core.CoreContainer.create(CoreContainer.java:846)
...

Now if I don't copy the schema and push my Nutch crawl to Solr, I get the
following error:

SolrIndexer: starting at 2012-11-08 10:49:02
Indexing 5 documents
java.io.IOException: Job failed!
SolrDeleteDuplicates: starting at 2012-11-08 10:49:47
SolrDeleteDuplicates: Solr url: http://photon:8983/solr/

And this is taken from the logging:
org.apache.solr.common.SolrException: ERROR: [doc=
http://e-docs/infrastructure/cpuload_monitor.html] unknown field 'host'

What should I do or what am I missing?

I hope you can help me
Best Regards
Antony

Re: Apache Nutch 1.5.1 + Apache Solr 4.0

Posted by Dave Meikle <lo...@gmail.com>.
Hi,

On 8 Nov 2012, at 15:00, Markus Jelsma <ma...@openindex.io> wrote:

> Hm, I copied the schema from Nutch's trunk verbatim and only had to change the stemmer.  It seems like you have, for some reason, a float with an extra point dangling around somewhere. Can you check?

I was just building a Nutch 1.5.1 environment and found this too.  It is actually the version number in schema.xml[1] and schema-solr4.xml[2] on the 1.5.1 branch that is the problem.

In these files the version number reads:
<schema name="nutch" version="1.5.1">

Whereas in trunk[3] it is:
<schema name="nutch" version="1.5">

As the attribute is read as a float in the IndexSchema class, 1.5.1 fails to parse because of the extra decimal point.  A quick change back to 1.5 in the file should solve things.
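
To see the failure in isolation, here is a minimal Java sketch. It assumes, going by the stack trace in this thread (Config.getFloat calling Float.parseFloat), that the version attribute is handed to Float.parseFloat as-is. The class and method names are made up for illustration and are not part of Solr or Nutch.

```java
// Hypothetical helper, not part of Solr or Nutch: shows how a schema
// version string behaves when it is parsed as a float.
class SchemaVersionCheck {

    // Returns true if the version string is accepted by Float.parseFloat,
    // the call that appears in the stack trace under Config.getFloat.
    static boolean isParsableVersion(String version) {
        try {
            Float.parseFloat(version);
            return true;
        } catch (NumberFormatException e) {
            // "1.5.1" ends up here because it contains two decimal points.
            return false;
        }
    }

    public static void main(String[] args) {
        for (String v : new String[] {"1.5", "1.5.1"}) {
            System.out.println(v + " parsable as float: " + isParsableVersion(v));
        }
    }
}
```

Running a check like this against the version attribute of your own schema.xml should tell you whether you have the 1.5.1 variant before you restart Solr.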

Cheers,
Dave

[1] http://svn.apache.org/repos/asf/nutch/branches/branch-1.5.1/conf/schema.xml
[2] http://svn.apache.org/repos/asf/nutch/branches/branch-1.5.1/conf/schema-solr4.xml
[3] http://svn.apache.org/repos/asf/nutch/trunk/conf/schema.xml




Re: Apache Nutch 1.5.1 + Apache Solr 4.0

Posted by John Whelan <wh...@gmail.com>.
Hi,

A while back, I had the same 'problem'. After solving it for myself, I
built and distributed a combination of Solr and Nutch into a pre-configured
environment. While what I did was specific to Windows (I included Cygwin in
the distribution, and a bunch of other stuff for easy administration of
Nutch), it might provide a reference for you in configuring Nutch with
Solr. (It also included an XSL transform for displaying Solr results like
Nutch used to display in the old days, which you might want.)

Anyhow, if you're interested, information on getting
the distribution can be found at
http://www.whelanlabs.com/home/programming-projects/search-engine

Regards,
John

RE: Apache Nutch 1.5.1 + Apache Solr 4.0

Posted by Markus Jelsma <ma...@openindex.io>.
Hm, I copied the schema from Nutch's trunk verbatim and only had to change the stemmer.  It seems like you have, for some reason, a float with an extra point dangling around somewhere. Can you check?
 

RE: Apache Nutch 1.5.1 + Apache Solr 4.0

Posted by Markus Jelsma <ma...@openindex.io>.
You're in SolrCloud mode, it needs that field. Just put it as explained in the error somewhere within your <fields> element.
 
 
-----Original message-----
> From:Antony Steiner <an...@gmail.com>
> Sent: Mon 12-Nov-2012 14:25
> To: solr-user@lucene.apache.org
> Subject: Re: Apache Nutch 1.5.1 + Apache Solr 4.0
> 
> Hello guys
> 
> thank you for your input. I now took the schema from the trunk. This helped
> me: <schema name="nutch" version="1.5"> I had another version which has the
> version number 1.5.1.
> I changed EnglishPorterFilter to PorterStemFilterFactory. But I still keep
> failing at starting up solr. Following error:
> 
> Nov 12, 2012 1:55:58 PM org.apache.solr.update.UpdateLog init
> SEVERE: Unable to use updateLog: _version_field must exist in schema, using
> indexed="true" stored="true" and multiValued="false" (_version_ does not
> exist)
> org.apache.solr.common.SolrException: _version_field must exist in schema,
> using indexed="true" stored="true" and multiValued="false" (_version_ does
> not exist)
> 
> Do I have to add a _version_ field? where do I put that?
> 
> Regards
> Antony
> 

Re: Apache Nutch 1.5.1 + Apache Solr 4.0

Posted by Rafał Kuć <r....@solr.pl>.
Hello!

Add the following field to the fields section of your schema.xml
file:

<field name="_version_" type="long" indexed="true" stored="true"/>
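
If you want to confirm that the field really is in the schema file you edited, a small check along these lines may help. It is only a sketch using the JDK's built-in XML parser. The class name, method name and the inline sample schema are made up for illustration, and it only looks at the three attributes named in the error message (it does not consider defaults inherited from the field type).

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of Solr: checks schema XML for a _version_
// field that is explicitly indexed and stored, and not multiValued.
class VersionFieldCheck {

    static boolean hasVersionField(String schemaXml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(
                            schemaXml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse schema XML", e);
        }
        NodeList fields = doc.getElementsByTagName("field");
        for (int i = 0; i < fields.getLength(); i++) {
            Element field = (Element) fields.item(i);
            if (!"_version_".equals(field.getAttribute("name"))) {
                continue;
            }
            // getAttribute returns "" for a missing attribute, so an absent
            // multiValued attribute counts as "not multiValued" here.
            return "true".equals(field.getAttribute("indexed"))
                    && "true".equals(field.getAttribute("stored"))
                    && !"true".equals(field.getAttribute("multiValued"));
        }
        return false;
    }

    public static void main(String[] args) {
        // Sample schema for illustration only; read your real schema.xml instead.
        String schema =
                "<schema name=\"nutch\" version=\"1.5\"><fields>"
              + "<field name=\"_version_\" type=\"long\" indexed=\"true\" stored=\"true\"/>"
              + "</fields></schema>";
        System.out.println("_version_ field present: " + hasVersionField(schema));
    }
}
```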

-- 
Regards,
 Rafał Kuć
 Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch

> Hello guys

> thank you for your input. I now took the schema from the trunk. This helped
> me: <schema name="nutch" version="1.5"> I had another version which has the
> version number 1.5.1.
> I changed EnglishPorterFilter to PorterStemFilterFactory. But I still keep
> failing at starting up solr. Following error:

> Nov 12, 2012 1:55:58 PM org.apache.solr.update.UpdateLog init
> SEVERE: Unable to use updateLog: _version_field must exist in schema, using
> indexed="true" stored="true" and multiValued="false" (_version_ does not
> exist)
> org.apache.solr.common.SolrException: _version_field must exist in schema,
> using indexed="true" stored="true" and multiValued="false" (_version_ does
> not exist)

> Do I have to add a _version_ field? where do I put that?

> Regards
> Antony


Re: Apache Nutch 1.5.1 + Apache Solr 4.0

Posted by Antony Steiner <an...@gmail.com>.
Thank you very much. Everything is working fine now.

Best regards
Antony

Re: Apache Nutch 1.5.1 + Apache Solr 4.0

Posted by Antony Steiner <an...@gmail.com>.
Hello guys

Thank you for your input. I have now taken the schema from trunk. This helped
me: <schema name="nutch" version="1.5">. I had another version with the
version number 1.5.1.
I changed EnglishPorterFilter to PorterStemFilterFactory. But Solr still
fails to start, with the following error:

Nov 12, 2012 1:55:58 PM org.apache.solr.update.UpdateLog init
SEVERE: Unable to use updateLog: _version_field must exist in schema, using
indexed="true" stored="true" and multiValued="false" (_version_ does not
exist)
org.apache.solr.common.SolrException: _version_field must exist in schema,
using indexed="true" stored="true" and multiValued="false" (_version_ does
not exist)

Do I have to add a _version_ field? Where do I put that?

Regards
Antony

Re: Apache Nutch 1.5.1 + Apache Solr 4.0

Posted by Antony Steiner <an...@gmail.com>.
Hi,

I just saw there is a schema-solr4.xml and a schema.xml in the Nutch conf
directory. But with both schemas I get the same errors when starting up
Solr.
Here's the stack trace:

Nov 8, 2012 3:32:14 PM org.apache.solr.core.SolrConfig <init>
INFO: Loaded SolrConfig: solrconfig.xml
Nov 8, 2012 3:32:14 PM org.apache.solr.schema.IndexSchema readSchema
INFO: Reading Solr Schema
Nov 8, 2012 3:32:14 PM org.apache.solr.schema.IndexSchema readSchema
INFO: Schema name=nutch
Nov 8, 2012 3:32:14 PM org.apache.solr.core.CoreContainer create
SEVERE: Unable to create core: collection1
org.apache.solr.common.SolrException: Schema Parsing Failed: multiple points
        at
org.apache.solr.schema.IndexSchema.readSchema(IndexSchema.java:571)
        at org.apache.solr.schema.IndexSchema.<init>(IndexSchema.java:113)
        at org.apache.solr.core.CoreContainer.create(CoreContainer.java:846)
        at org.apache.solr.core.CoreContainer.load(CoreContainer.java:534)
        at org.apache.solr.core.CoreContainer.load(CoreContainer.java:356)
        at
org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:308)
        at
org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:107)
        at
org.eclipse.jetty.servlet.FilterHolder.doStart(FilterHolder.java:114)
        at
org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:59)
        at
org.eclipse.jetty.servlet.ServletHandler.initialize(ServletHandler.java:754)
        at
org.eclipse.jetty.servlet.ServletContextHandler.startContext(ServletContextHandler.java:258)
        at
org.eclipse.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1221)
        at
org.eclipse.jetty.server.handler.ContextHandler.doStart(ContextHandler.java:699)
        at
org.eclipse.jetty.webapp.WebAppContext.doStart(WebAppContext.java:454)
        at
org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:59)
        at
org.eclipse.jetty.deploy.bindings.StandardStarter.processBinding(StandardStarter.java:36)
        at
org.eclipse.jetty.deploy.AppLifeCycle.runBindings(AppLifeCycle.java:183)
        at
org.eclipse.jetty.deploy.DeploymentManager.requestAppGoal(DeploymentManager.java:491)
        at
org.eclipse.jetty.deploy.DeploymentManager.addApp(DeploymentManager.java:138)
        at
org.eclipse.jetty.deploy.providers.ScanningAppProvider.fileAdded(ScanningAppProvider.java:142)
        at
org.eclipse.jetty.deploy.providers.ScanningAppProvider$1.fileAdded(ScanningAppProvider.java:53)
        at org.eclipse.jetty.util.Scanner.reportAddition(Scanner.java:604)
        at
org.eclipse.jetty.util.Scanner.reportDifferences(Scanner.java:535)
        at org.eclipse.jetty.util.Scanner.scan(Scanner.java:398)
        at org.eclipse.jetty.util.Scanner.doStart(Scanner.java:332)
        at
org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:59)
        at
org.eclipse.jetty.deploy.providers.ScanningAppProvider.doStart(ScanningAppProvider.java:118)
        at
org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:59)
        at
org.eclipse.jetty.deploy.DeploymentManager.startAppProvider(DeploymentManager.java:552)
        at
org.eclipse.jetty.deploy.DeploymentManager.doStart(DeploymentManager.java:227)
        at
org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:59)
        at
org.eclipse.jetty.util.component.AggregateLifeCycle.doStart(AggregateLifeCycle.java:63)
        at
org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:53)
        at
org.eclipse.jetty.server.handler.HandlerWrapper.doStart(HandlerWrapper.java:91)
        at org.eclipse.jetty.server.Server.doStart(Server.java:263)
        at
org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:59)
        at
org.eclipse.jetty.xml.XmlConfiguration$1.run(XmlConfiguration.java:1215)
        at java.security.AccessController.doPrivileged(Native Method)
        at
org.eclipse.jetty.xml.XmlConfiguration.main(XmlConfiguration.java:1138)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.eclipse.jetty.start.Main.invokeMain(Main.java:457)
        at org.eclipse.jetty.start.Main.start(Main.java:602)
        at org.eclipse.jetty.start.Main.main(Main.java:82)
Caused by: java.lang.NumberFormatException: multiple points
        at
sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:1082)
        at java.lang.Float.parseFloat(Float.java:422)
        at org.apache.solr.core.Config.getFloat(Config.java:284)
        at
org.apache.solr.schema.IndexSchema.readSchema(IndexSchema.java:358)
        ... 45 more
Nov 8, 2012 3:32:14 PM org.apache.solr.common.SolrException log
SEVERE: null:org.apache.solr.common.SolrException: Schema Parsing Failed:
multiple points
        at
org.apache.solr.schema.IndexSchema.readSchema(IndexSchema.java:571)
        at org.apache.solr.schema.IndexSchema.<init>(IndexSchema.java:113)
        at org.apache.solr.core.CoreContainer.create(CoreContainer.java:846)
        at org.apache.solr.core.CoreContainer.load(CoreContainer.java:534)
        at org.apache.solr.core.CoreContainer.load(CoreContainer.java:356)
        at
org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:308)
        at
org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:107)
        at
org.eclipse.jetty.servlet.FilterHolder.doStart(FilterHolder.java:114)
        at
org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:59)
        at
org.eclipse.jetty.servlet.ServletHandler.initialize(ServletHandler.java:754)
        at
org.eclipse.jetty.servlet.ServletContextHandler.startContext(ServletContextHandler.java:258)
        at
org.eclipse.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1221)
        at
org.eclipse.jetty.server.handler.ContextHandler.doStart(ContextHandler.java:699)
        at
org.eclipse.jetty.webapp.WebAppContext.doStart(WebAppContext.java:454)
        at
org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:59)
        at
org.eclipse.jetty.deploy.bindings.StandardStarter.processBinding(StandardStarter.java:36)
        at
org.eclipse.jetty.deploy.AppLifeCycle.runBindings(AppLifeCycle.java:183)
        at
org.eclipse.jetty.deploy.DeploymentManager.requestAppGoal(DeploymentManager.java:491)
        at
org.eclipse.jetty.deploy.DeploymentManager.addApp(DeploymentManager.java:138)
        at
org.eclipse.jetty.deploy.providers.ScanningAppProvider.fileAdded(ScanningAppProvider.java:142)
        at
org.eclipse.jetty.deploy.providers.ScanningAppProvider$1.fileAdded(ScanningAppProvider.java:53)
        at org.eclipse.jetty.util.Scanner.reportAddition(Scanner.java:604)
        at
org.eclipse.jetty.util.Scanner.reportDifferences(Scanner.java:535)
        at org.eclipse.jetty.util.Scanner.scan(Scanner.java:398)
        at org.eclipse.jetty.util.Scanner.doStart(Scanner.java:332)
        at
org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:59)
        at
org.eclipse.jetty.deploy.providers.ScanningAppProvider.doStart(ScanningAppProvider.java:118)
        at
org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:59)
        at
org.eclipse.jetty.deploy.DeploymentManager.startAppProvider(DeploymentManager.java:552)
        at
org.eclipse.jetty.deploy.DeploymentManager.doStart(DeploymentManager.java:227)
        at
org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:59)
        at
org.eclipse.jetty.util.component.AggregateLifeCycle.doStart(AggregateLifeCycle.java:63)
        at
org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:53)
        at
org.eclipse.jetty.server.handler.HandlerWrapper.doStart(HandlerWrapper.java:91)
        at org.eclipse.jetty.server.Server.doStart(Server.java:263)
        at
org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:59)
        at
org.eclipse.jetty.xml.XmlConfiguration$1.run(XmlConfiguration.java:1215)
        at java.security.AccessController.doPrivileged(Native Method)
        at
org.eclipse.jetty.xml.XmlConfiguration.main(XmlConfiguration.java:1138)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.eclipse.jetty.start.Main.invokeMain(Main.java:457)
        at org.eclipse.jetty.start.Main.start(Main.java:602)
        at org.eclipse.jetty.start.Main.main(Main.java:82)
Caused by: java.lang.NumberFormatException: multiple points
        at
sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:1082)
        at java.lang.Float.parseFloat(Float.java:422)
        at org.apache.solr.core.Config.getFloat(Config.java:284)
        at
org.apache.solr.schema.IndexSchema.readSchema(IndexSchema.java:358)
        ... 45 more

Regards
Antony


2012/11/8 Markus Jelsma <ma...@openindex.io>

> Hi - it fixes it here. Please post the full stack trace.
>
> -----Original message-----
> > From:Antony Steiner <an...@gmail.com>
> > Sent: Thu 08-Nov-2012 15:16
> > To: solr-user@lucene.apache.org
> > Subject: Re: Apache Nutch 1.5.1 + Apache Solr 4.0
> >
> > Hi,
> >
> > Thank you for your suggestion. Nope, it didn't change anything. Should I
> > post the full stack trace?
> >
> > Regards
> > Antony
> >
> >
> > 2012/11/8 Markus Jelsma <ma...@openindex.io>
> >
> > > Hi,
> > >
> > > Your Nutch schema likely points to the old EnglishPorterFilter that
> > > doesn't exist anymore. You can change that occurrence to
> > > PorterStemFilterFactory; that should fix the issue.
> > >
> > > -----Original message-----
> > > > From:Antony Steiner <an...@gmail.com>
> > > > Sent: Thu 08-Nov-2012 14:05
> > > > To: solr-user@lucene.apache.org
> > > > Subject: Apache Nutch 1.5.1 + Apache Solr 4.0
> > > >
> > > > Hello, my name is Antony and I'm new to Apache Nutch and Solr.
> > > >
> > > > I want to crawl my website and therefore I downloaded Nutch to do this.
> > > > This works fine. But now I would like to integrate Nutch with Solr. I'm
> > > > running this on my Unix system.
> > > > I'm trying to follow this tutorial:
> > > > http://wiki.apache.org/nutch/NutchTutorial
> > > > But it won't work for me. Running Solr without Nutch is no problem. I can post
> > > > documents to Solr with post.jar. But what I want to do is post my Nutch
> > > > crawl to Solr.
> > > > Now if I copy the schema.xml from Nutch to the
> > > > apache-solr-4.0.0/example/solr/collection1/conf directory and restart Solr
> > > > (java -jar start.jar), I get the errors below, but Solr will start. (Is this
> > > > the correct directory to copy my schema to?)
> > > >
> > > > Nov 8, 2012 9:40:33 AM org.apache.solr.schema.IndexSchema readSchema
> > > > INFO: Schema name=nutch
> > > > Nov 8, 2012 9:40:33 AM org.apache.solr.core.CoreContainer create
> > > > SEVERE: Unable to create core: collection1
> > > > org.apache.solr.common.SolrException: Schema Parsing Failed: multiple
> > > points
> > > >         at
> > > > org.apache.solr.schema.IndexSchema.readSchema(IndexSchema.java:571)
> > > >         at
> > > org.apache.solr.schema.IndexSchema.<init>(IndexSchema.java:113)
> > > > ...
> > > >
> > > > Nov 8, 2012 9:40:33 AM org.apache.solr.common.SolrException log
> > > > SEVERE: null:org.apache.solr.common.SolrException: Schema Parsing
> Failed:
> > > > multiple points
> > > >         at
> > > > org.apache.solr.schema.IndexSchema.readSchema(IndexSchema.java:571)
> > > >         at
> > > org.apache.solr.schema.IndexSchema.<init>(IndexSchema.java:113)
> > > >         at
> > > org.apache.solr.core.CoreContainer.create(CoreContainer.java:846)
> > > > ...
> > > >
> > > > Now if I don't copy the schema and push my nutch crawl to solr I get
> > > > following error:
> > > >
> > > > SolrIndexer: starting at 2012-11-08 10:49:02
> > > > Indexing 5 documents
> > > > java.io.IOException: Job failed!
> > > > SolrDeleteDuplicates: starting at 2012-11-08 10:49:47
> > > > SolrDeleteDuplicates: Solr url: http://photon:8983/solr/
> > > >
> > > > And this is taken from the logging:
> > > > org.apache.solr.common.SolrException: ERROR: [doc=
> > > > http://e-docs/infrastructure/cpuload_monitor.html] unknown field
> 'host'
> > > >
> > > > What should I do or what am I missing?
> > > >
> > > > I hope you can help me
> > > > Best Regards
> > > > Antony
> > > >
> > >
> >
>

RE: Apache Nutch 1.5.1 + Apache Solr 4.0

Posted by Markus Jelsma <ma...@openindex.io>.
Hi - it fixes it here. Please post the full stack trace.
 

Re: Apache Nutch 1.5.1 + Apache Solr 4.0

Posted by Antony Steiner <an...@gmail.com>.
Hi,

Thank you for your suggestion. Nope, it didn't change anything. Should I
post the full stacktrace?

Regards
Antony



RE: Apache Nutch 1.5.1 + Apache Solr 4.0

Posted by Markus Jelsma <ma...@openindex.io>.
Hi, 

Your Nutch schema likely points to the old EnglishPorterFilter, which no longer exists. You can change that occurrence to PorterStemFilterFactory; that should fix the issue.
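
The rename suggested above could be applied to the schema text roughly as in the following sketch. The fully qualified names solr.EnglishPorterFilterFactory and solr.PorterStemFilterFactory, the sample filter line, and the class name PorterFilterRename are assumptions for illustration; check them against the actual schema.xml before changing anything.

```java
public class PorterFilterRename {
    // Assumed factory names; the message above only gives the short forms.
    static final String OLD_FACTORY = "solr.EnglishPorterFilterFactory";
    static final String NEW_FACTORY = "solr.PorterStemFilterFactory";

    // Replaces every reference to the old factory in the given schema text.
    public static String rename(String schemaXml) {
        return schemaXml.replace(OLD_FACTORY, NEW_FACTORY);
    }

    public static void main(String[] args) {
        // Hypothetical filter line standing in for the real schema content.
        String line = "<filter class=\"solr.EnglishPorterFilterFactory\"/>";
        System.out.println(rename(line));
    }
}
```

Editing the one line by hand in schema.xml achieves the same thing; the sketch only makes the before and after explicit.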
 

Re: Apache Nutch 1.5.1 + Apache Solr 4.0

Posted by Iwan Hanjoyo <ih...@gmail.com>.
Hi Steiner,

    I found a video tutorial on Nutch 1.4 + Solr 3.4.0 (on Windows).
    It solved my error; I hope it does the same for yours.
    Here is the link:
    Running Nutch and Solr on Windows Tutorial: Part 1
    http://www.youtube.com/watch?v=baxhI6Wkov8
    Running Nutch and Solr on Windows Tutorial: Part 2
    http://www.youtube.com/watch?v=Qs-18hRRpNU
    Running Nutch and Solr on Windows Tutorial: Part 3
    http://www.youtube.com/watch?v=GtbDHiYrlNE

    Published on Mar 15, 2012 by Dutedute2

Kind regards,


Hanjoyo
