Posted to solr-user@lucene.apache.org by Brandon Waterloo <Br...@matrix.msu.edu> on 2011/03/21 21:00:02 UTC

Multiple Cores with Solr Cell for indexing documents

Hello everyone,

I've been trying for several hours now to set up Solr with multiple cores with Solr Cell working on each core.  The only items being indexed are PDF, DOC, and TXT files (with the possibility of expanding this list, but for now, just assume the only things in the index should be documents).

I never had any problems with Solr Cell when I was using a single core.  In fact, I just ran the default installation in example/ and worked from that.  However, trying to migrate to multi-core has been a never-ending list of problems.

Any time I try to add a document to the index (using the same curl command as I did to add to the single core, of course adding the core name to the request URL-- host/solr/corename/update/extract...), I get HTTP 500 errors due to classes not being found and/or lazy loading errors.  I've copied the exact example/lib directory into the cores, and that doesn't work either.

Frankly, the only libraries I want are those relevant to indexing files.  The less bloat, the better, after all.  However, I cannot figure out which files to put where, or why the example installation works perfectly for single-core but not for multi-core.

Here is an example of the errors I'm receiving:

command prompt> curl "host/solr/core0/update/extract?literal.id=2-3-1&commit=true" -F "myfile=@test2.txt"

<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"/>
<title>Error 500 </title>
</head>
<body><h2>HTTP ERROR: 500</h2><pre>org/apache/tika/exception/TikaException

java.lang.NoClassDefFoundError: org/apache/tika/exception/TikaException
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:247)
        at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:359)
        at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:413)
        at org.apache.solr.core.SolrCore.createRequestHandler(SolrCore.java:449)
        at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.getWrappedHandler(RequestHandlers.java:240)
        at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:231)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
        at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
        at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
        at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
        at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
        at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
        at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
        at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
        at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211)
        at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
        at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
        at org.mortbay.jetty.Server.handle(Server.java:285)
        at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
        at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:835)
        at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:641)
        at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:202)
        at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
        at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
        at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)
Caused by: java.lang.ClassNotFoundException: org.apache.tika.exception.TikaException
        at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
        ... 27 more
</pre>
<p>RequestURI=/solr/core0/update/extract</p><p><i><small><a href="http://jetty.mortbay.org/">Powered by Jetty://</a></small></i></p><br/>

</body>
</html>

Any assistance you could provide or installation guides/tutorials/etc. that you could link me to would be greatly appreciated.  Thank you all for your time!

~Brandon Waterloo
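
A note on the "lazy loading" part of the error: the LazyRequestHandlerWrapper frames in the trace suggest that the extract handler is registered with startup="lazy", as in the stock example solrconfig.xml, so the Tika classes are only looked up on the first request to /update/extract. A sketch of such a declaration follows; it assumes the example config of that era, and the defaults shown are illustrative rather than taken from this thread.

```xml
<!-- solrconfig.xml (per core). Sketch modeled on the stock example config;
     the defaults below are illustrative assumptions. -->
<requestHandler name="/update/extract"
                class="org.apache.solr.handler.extraction.ExtractingRequestHandler"
                startup="lazy">
  <lst name="defaults">
    <!-- map the body text extracted by Tika into a field named "text" -->
    <str name="fmap.content">text</str>
    <!-- lowercase the metadata field names produced by Tika -->
    <str name="lowernames">true</str>
  </lst>
</requestHandler>
```

With a lazy handler, a missing jar would not show up when the core starts; it would surface as an HTTP 500 on the first extract request, which is consistent with the trace above.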


Re: Multiple Cores with Solr Cell for indexing documents

Posted by Erick Erickson <er...@gmail.com>.
Right, and you can go to sharding rather than managing your multiple
cores if that's warranted.

Erick

On Fri, Mar 25, 2011 at 1:31 PM, Brandon Waterloo
<Br...@matrix.msu.edu> wrote:
> I did finally manage to deploy Solr with multiple cores but we've been running into so many problems with permissions, index location, and other things that I (quite fortunately) convinced my boss that multiple cores are not the way to go here.  I had in place a single-core system that would filter the results based on their ID numbers, and show only the subset of results that you wanted to see.  The disadvantage is that it's a single core and thus will take longer to search over the entire index.  The advantage is that it's better in every other way.
>
> So the plan now is to move back to single-core searching and then test it with a huge amount of documents to see whether performance is seriously impacted or not.  So for now, I guess we can consider this thread resolved.
>
> Thanks for all your help guys!
>
> ~Brandon Waterloo
>
>

RE: Multiple Cores with Solr Cell for indexing documents

Posted by Brandon Waterloo <Br...@matrix.msu.edu>.
I did finally manage to deploy Solr with multiple cores but we've been running into so many problems with permissions, index location, and other things that I (quite fortunately) convinced my boss that multiple cores are not the way to go here.  I had in place a single-core system that would filter the results based on their ID numbers, and show only the subset of results that you wanted to see.  The disadvantage is that it's a single core and thus will take longer to search over the entire index.  The advantage is that it's better in every other way.

So the plan now is to move back to single-core searching and then test it with a huge amount of documents to see whether performance is seriously impacted or not.  So for now, I guess we can consider this thread resolved.

Thanks for all your help guys!

~Brandon Waterloo


Re: Multiple Cores with Solr Cell for indexing documents

Posted by Markus Jelsma <ma...@openindex.io>.
Per-core lib directories can only be configured in solrconfig.xml.
In solr.xml you can use sharedLib, though.
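
For reference, here is a sketch of the lib directives in a core's solrconfig.xml. Relative dir values are resolved against the core's instanceDir, so paths copied from the single-core example typically need one extra ../ when the core sits one level deeper. The layout (a core under example/solr/core0 inside an unpacked Solr distribution) is an assumption based on the paths mentioned in this thread.

```xml
<!-- example/solr/core0/conf/solrconfig.xml (assumed layout) -->
<config>
  <!-- Tika and its dependencies: three levels up from
       example/solr/core0 is the distribution root -->
  <lib dir="../../../contrib/extraction/lib" />
  <!-- the Solr Cell jar that contains ExtractingRequestHandler -->
  <lib dir="../../../dist/" regex="apache-solr-cell-\d.*\.jar" />
  <!-- remainder of the configuration omitted -->
</config>
```

In the single-core example the same directives use ../../ because the instanceDir there is example/solr. That difference may explain why the stock configuration stops finding Tika after a move to multiple cores.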

> There's options in solr.xml that point to lib dirs. Make sure you get
> them right.
> 
> Upayavira
> 

Re: Multiple Cores with Solr Cell for indexing documents

Posted by Upayavira <uv...@odoko.co.uk>.
There are options in solr.xml that point to lib dirs. Make sure you get
them right.

Upayavira
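
As an illustration, a minimal multicore solr.xml with a shared library directory might look like the sketch below. The core names are placeholders. sharedLib is resolved relative to the Solr home directory (the one that holds solr.xml), so with solr.xml in .../example/solr/ the value "lib" should point to .../example/solr/lib.

```xml
<!-- .../example/solr/solr.xml -->
<solr persistent="true" sharedLib="lib">
  <cores adminPath="/admin/cores">
    <!-- instanceDir is relative to the Solr home directory -->
    <core name="core0" instanceDir="core0" />
    <core name="core1" instanceDir="core1" />
  </cores>
</solr>
```

Jars placed in the shared directory are added to the classpath of every core, so the Tika and Solr Cell jars would only need to be copied there once.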

On Thu, 24 Mar 2011 23:28 +0100, "Markus Jelsma"
<ma...@openindex.io> wrote:
> I believe it's example/solr/lib where it looks for shared libs in
> multicore. 
> But each core can have its own lib dir, usually in core/lib. This is
> referenced in solrconfig.xml; see the example config for the lib
> directive.
> 
> > Well, there lies the problem--it's not JUST the Tika jar.  If it's not one
> > thing, it's another, and I'm not even sure which directory Solr actually
> > looks in.  In my solr.xml file I have it use a shared library folder for
> > every core.  Since each core will be holding very similar data, there's
> > no need to have any different library modules for each.
> > 
> > The relevant line in my solr.xml file is <solr persistent="true"
> > sharedLib="lib">.  That is housed in .../example/solr/.  So, does it look
> > in .../example/lib or .../example/solr/lib?
> > 
> > ~Brandon Waterloo
> 
--- 
Enterprise Search Consultant at Sourcesense UK, 
Making Sense of Open Source


Re: Multiple Cores with Solr Cell for indexing documents

Posted by Markus Jelsma <ma...@openindex.io>.
I believe it's example/solr/lib where it looks for shared libs in multicore.
But each core can have its own lib dir, usually in core/lib. This is
referenced in solrconfig.xml; see the example config for the lib directive.
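
For illustration, a per-core lib directive might look like the sketch below. It is modeled on the stock example solrconfig.xml, but the relative paths and the jar name pattern are assumptions: they presume the core lives in .../example/solr/core0/ inside an unpacked Solr distribution. Relative dir values are resolved against the core's instanceDir, so paths that worked for a single core in .../example/solr/ will likely need one more "../" once the config moves down into a core directory.

```xml
<!-- .../example/solr/core0/conf/solrconfig.xml (other settings omitted) -->
<config>
  <!-- Tika and its dependencies; assumed location in the distribution -->
  <lib dir="../../../contrib/extraction/lib" />
  <!-- the Solr Cell jar itself; assumed location and file name pattern -->
  <lib dir="../../../dist/" regex="apache-solr-cell-\d.*\.jar" />
</config>
```

If every core needs the same jars, copying them into the shared lib directory instead avoids repeating these directives in each core's config.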

> Well, there lies the problem--it's not JUST the Tika jar.  If it's not one
> thing, it's another, and I'm not even sure which directory Solr actually
> looks in.  In my Solr.xml file I have it use a shared library folder for
> every core.  Since each core will be holding very homologous data, there's
> no need to have any different library modules for each.
> 
> The relevant line in my solr.xml file is <solr persistent="true"
> sharedLib="lib">.  That is housed in .../example/solr/.  So, does it look
> in .../example/lib or .../example/solr/lib?
> 
> ~Brandon Waterloo

Multiple Cores with Solr Cell for indexing documents

Posted by Brandon Waterloo <Br...@matrix.msu.edu>.
Well, there lies the problem--it's not JUST the Tika jar. If it's not one thing, it's another, and I'm not even sure which directory Solr actually looks in. In my solr.xml file I have it use a shared library folder for every core. Since each core will be holding very similar data, there's no need for different library modules in each.

The relevant line in my solr.xml file is <solr persistent="true" sharedLib="lib">. That file is housed in .../example/solr/. So, does it look in .../example/lib or .../example/solr/lib?

~Brandon Waterloo
________________________________________
From: Markus Jelsma [markus.jelsma@openindex.io]
Sent: Thursday, March 24, 2011 11:29 AM
To: solr-user@lucene.apache.org
Cc: Brandon Waterloo
Subject: Re: Multiple Cores with Solr Cell for indexing documents

Sounds like the Tika jar is not on the class path. Add it to a directory where
Solr's looking for libs.


Re: Multiple Cores with Solr Cell for indexing documents

Posted by Markus Jelsma <ma...@openindex.io>.
Sounds like the Tika jar is not on the class path. Add it to a directory where 
Solr's looking for libs.

On Thursday 24 March 2011 16:24:17 Brandon Waterloo wrote:
> Hello everyone,
> 
> I've been trying for several hours now to set up Solr with multiple cores
> with Solr Cell working on each core. The only items being indexed are PDF,
> DOC, and TXT files (with the possibility of expanding this list, but for
> now, just assume the only things in the index should be documents).
> 
> I never had any problems with Solr Cell when I was using a single core. In
> fact, I just ran the default installation in example/ and worked from
> that. However, trying to migrate to multi-core has been a never ending
> list of problems.
> 
> Any time I try to add a document to the index (using the same curl command
> as I did to add to the single core, of course adding the core name to the
> request URL-- host/solr/corename/update/extract...), I get HTTP 500 errors
> due to classes not being found and/or lazy loading errors. I've copied the
> exact example/lib directory into the cores, and that doesn't work either.
> 
> Frankly the only libraries I want are those relevant to indexing files. The
> less bloat, the better, after all. However, I cannot figure out where to
> put what files, and why the example installation works perfectly for
> single-core but not with multi-cores.
> 
> Here is an example of the errors I'm receiving:
> 
> command prompt> curl "host/solr/core0/update/extract?literal.id=2-3-1&commit=true" -F "myfile=@test2.txt"
> 
> <html>
> <head>
> <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"/>
> <title>Error 500 </title>
> </head>
> <body><h2>HTTP ERROR: 500</h2><pre>org/apache/tika/exception/TikaException
> 
> java.lang.NoClassDefFoundError: org/apache/tika/exception/TikaException
> at java.lang.Class.forName0(Native Method)
> at java.lang.Class.forName(Class.java:247)
> at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:359)
> at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:413)
> at org.apache.solr.core.SolrCore.createRequestHandler(SolrCore.java:449)
> at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.getWrappedHandler(RequestHandlers.java:240)
> at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:231)
> at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
> at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
> at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
> at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
> at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
> at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
> at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
> at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
> at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
> at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211)
> at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
> at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
> at org.mortbay.jetty.Server.handle(Server.java:285)
> at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
> at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:835)
> at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:641)
> at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:202)
> at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
> at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
> at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)
> Caused by: java.lang.ClassNotFoundException: org.apache.tika.exception.TikaException
> at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
> at java.security.AccessController.doPrivileged(Native Method)
> at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
> ... 27 more
> </pre>
> <p>RequestURI=/solr/core0/update/extract</p><p><i><small><a
> href="http://jetty.mortbay.org/">Powered by
> Jetty://</a></small></i></p><br/> <br/>
> 
> </body>
> </html>
> 
> Any assistance you could provide or installation guides/tutorials/etc. that
> you could link me to would be greatly appreciated. Thank you all for your
> time!
> 
> ~Brandon Waterloo

-- 
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350

Multiple Cores with Solr Cell for indexing documents

Posted by Brandon Waterloo <Br...@matrix.msu.edu>.
Hello everyone,

I've been trying for several hours now to set up Solr with multiple cores with Solr Cell working on each core. The only items being indexed are PDF, DOC, and TXT files (with the possibility of expanding this list, but for now, just assume the only things in the index should be documents).

I never had any problems with Solr Cell when I was using a single core. In fact, I just ran the default installation in example/ and worked from that. However, trying to migrate to multi-core has been a never-ending list of problems.

Any time I try to add a document to the index (using the same curl command as I did to add to the single core, of course adding the core name to the request URL-- host/solr/corename/update/extract...), I get HTTP 500 errors due to classes not being found and/or lazy loading errors. I've copied the exact example/lib directory into the cores, and that doesn't work either.

Frankly, the only libraries I want are those relevant to indexing files. The less bloat, the better, after all. However, I cannot figure out which files go where, or why the example installation works perfectly for single-core but not for multi-core.

Here is an example of the errors I'm receiving:

command prompt> curl "host/solr/core0/update/extract?literal.id=2-3-1&commit=true" -F "myfile=@test2.txt"

<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"/>
<title>Error 500 </title>
</head>
<body><h2>HTTP ERROR: 500</h2><pre>org/apache/tika/exception/TikaException

java.lang.NoClassDefFoundError: org/apache/tika/exception/TikaException
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:247)
at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:359)
at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:413)
at org.apache.solr.core.SolrCore.createRequestHandler(SolrCore.java:449)
at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.getWrappedHandler(RequestHandlers.java:240)
at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:231)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211)
at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
at org.mortbay.jetty.Server.handle(Server.java:285)
at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:835)
at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:641)
at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:202)
at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)
Caused by: java.lang.ClassNotFoundException: org.apache.tika.exception.TikaException
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
... 27 more
</pre>
<p>RequestURI=/solr/core0/update/extract</p><p><i><small><a href="http://jetty.mortbay.org/">Powered by Jetty://</a></small></i></p><br/>

</body>
</html>

Any assistance you could provide or installation guides/tutorials/etc. that you could link me to would be greatly appreciated. Thank you all for your time!

~Brandon Waterloo