Posted to solr-user@lucene.apache.org by Tom Weber <to...@rtl.lu> on 2006/09/06 16:02:09 UTC

Double Solr Installation on Single Tomcat (or Double Index)

Hello,

   I need to have a second separate index (separate data) on the same  
server.

   Is there a possibility to do this in a single solr install on a  
tomcat server or do I need to have a second instance in the same  
tomcat install ?

   If either one is possible, does somebody have some advice on how to
set this up, and how to be sure that both indexes do not interact?

   Many thanks for any help,

   Best Greetings,

   Tom

Re: SolrCore as Singleton?

Posted by Eivind Hasle Amundsen <ei...@ifi.uio.no>.
Tim Archambault wrote:
> In regard to the comment about lack of an interface, I view this as a
> benefit of the tool.
> 
> Whether I'm developing with Python, PHP, Coldfusion, .NET, Java, etc.
> I can create my own customizable interface. As a coldfusion programmer
> with moderate programming capabilities, this tool is perfect for my
> needs.

That's good to hear. I never meant that a GUI should replace anything at 
all. Did it come out that way?

As the product evolves, it is only natural to add capabilities and 
features. Some of these should be available from different interfaces, 
including GUI(s). However one should be able to interface with the 
application at different levels. When Solr gets more complex over time, 
care must be taken so it does not get complicated. There might be 
numerous more points of entry into a more complex product. It is 
necessary to keep things simple as well as providing centralized 
configuration possibilities. Following this philosophy, Solr users will 
be able to choose their level of interaction.

(In a metaphor, some people prefer using GNU/Linux just by installing a 
distro; others compile and become best friends with the command line.)

Eivind

Re: Re: SolrCore as Singleton?

Posted by Tim Archambault <ta...@bangordailynews.net>.
In regard to the comment about lack of an interface, I view this as a
benefit of the tool.

Whether I'm developing with Python, PHP, Coldfusion, .NET, Java, etc.
I can create my own customizable interface. As a coldfusion programmer
with moderate programming capabilities, this tool is perfect for my
needs.



On 9/8/06, Andrew May <am...@ingenta.com> wrote:
> Chris Hostetter wrote:
> > : Nice.  Is the same doable under Jetty? (never had to deal with JNDI
> > : under Jetty)
> >
> > i haven't tried it personally, but according to Yoav "reading" JNDI
> > options is part of the Servlet Spec, and billa found a reference to
> > using "<env-entry>" to do so...
> >
> > http://www.nabble.com/Re%3A-multiple-solr-webapps-p3991310.html
> >
> > ...where exactly that option goes in Jetty's configuration isn't something
> > i'm clear on.
> >
>
> <env-entry> values go in web.xml, so it would mean having modified versions of solr.war
> for each collection.
>
> <env-entry> is an optional part of the Servlet spec for standalone servlet
> implementations. The basic version of Jetty does not have any JNDI support, you need to
> use JettyPlus (http://jetty.mortbay.org/jetty5/plus/index.html) for that.
>
> -Andrew
>

Re: SolrCore as Singleton?

Posted by Andrew May <am...@ingenta.com>.
Chris Hostetter wrote:
> : Nice.  Is the same doable under Jetty? (never had to deal with JNDI
> : under Jetty)
> 
> i haven't tried it personally, but according to Yoav "reading" JNDI
> options is part of the Servlet Spec, and billa found a reference to
> using "<env-entry>" to do so...
> 
> http://www.nabble.com/Re%3A-multiple-solr-webapps-p3991310.html
> 
> ...where exactly that option goes in Jetty's configuration isn't something
> i'm clear on.
> 

<env-entry> values go in web.xml, so it would mean having modified versions of solr.war 
for each collection.

<env-entry> is an optional part of the Servlet spec for standalone servlet 
implementations. The basic version of Jetty does not have any JNDI support, you need to 
use JettyPlus (http://jetty.mortbay.org/jetty5/plus/index.html) for that.
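For concreteness, the <env-entry> being discussed would sit in each webapp's web.xml and might look something like this (a sketch; the value path is a placeholder, and the exact element ordering depends on the servlet spec version in use):

```xml
<!-- inside the <web-app> element of solr.war's web.xml -->
<env-entry>
  <env-entry-name>solr/home</env-entry-name>
  <env-entry-value>/path/to/solr/home</env-entry-value>
  <env-entry-type>java.lang.String</env-entry-type>
</env-entry>
```

Since this lives inside the war, each collection would indeed need its own modified copy of solr.war, which is Andrew's point above.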

-Andrew

Re: SolrCore as Singleton?

Posted by Chris Hostetter <ho...@fucit.org>.
: Nice.  Is the same doable under Jetty? (never had to deal with JNDI
: under Jetty)

i haven't tried it personally, but according to Yoav "reading" JNDI
options is part of the Servlet Spec, and billa found a reference to
using "<env-entry>" to do so...

http://www.nabble.com/Re%3A-multiple-solr-webapps-p3991310.html

...where exactly that option goes in Jetty's configuration isn't something
i'm clear on.


: ----- Original Message ----
: From: Chris Hostetter <ho...@fucit.org>
: To: solr-user@lucene.apache.org
: Sent: Friday, September 8, 2006 1:46:19 AM
: Subject: Re: SolrCore as Singleton?
:
:
: : I am currently in the startup phase of my thesis regarding open source
: : and enterprise search. After having worked at perhaps the leading major
: : enterprise search company, I have the impression that multiple
: : collections is a very common feature (and very sought-after). It is a
: : trend I see not just directly from my work, but most certainly also as a
: : result of enterprise search solutions becoming more common in general.
:
: SolrCore being a singleton doesn't prevent you from having multiple
: collections per JVM -- you just need to run multiple instances of the
: webapp within a single servlet container using JNDI to specify the
: separate solr.home directories; specifics for doing this in Tomcat are on
: the wiki...
:    http://wiki.apache.org/solr/SolrTomcat
:
: : Until this framework is available with its appropriate configuration
: : files, administrator interface and so on in place, it seems a bit
: : unnatural to support multiple collections from the same application
: : instance.
: :
: : Bottom line (for now): I think that users looking for enterprise search
: : solutions must have a simple way of creating multiple collections from
: : within the same application.
:
: Well it's pretty easy right now to make a new collection -- it's
: just two new files (solrconfig.xml and schema.xml)
:
:
:
:
: -Hoss
:
:
:
:



-Hoss


Re: SolrCore as Singleton?

Posted by Otis Gospodnetic <ot...@yahoo.com>.
Nice.  Is the same doable under Jetty? (never had to deal with JNDI under Jetty)

Otis

----- Original Message ----
From: Chris Hostetter <ho...@fucit.org>
To: solr-user@lucene.apache.org
Sent: Friday, September 8, 2006 1:46:19 AM
Subject: Re: SolrCore as Singleton?


: I am currently in the startup phase of my thesis regarding open source
: and enterprise search. After having worked at perhaps the leading major
: enterprise search company, I have the impression that multiple
: collections is a very common feature (and very sought-after). It is a
: trend I see not just directly from my work, but most certainly also as a
: result of enterprise search solutions becoming more common in general.

SolrCore being a singleton doesn't prevent you from having multiple
collections per JVM -- you just need to run multiple instances of the
webapp within a single servlet container using JNDI to specify the
separate solr.home directories; specifics for doing this in Tomcat are on
the wiki...
   http://wiki.apache.org/solr/SolrTomcat

: Until this framework is available with its appropriate configuration
: files, administrator interface and so on in place, it seems a bit
: unnatural to support multiple collections from the same application
: instance.
:
: Bottom line (for now): I think that users looking for enterprise search
: solutions must have a simple way of creating multiple collections from
: within the same application.

Well it's pretty easy right now to make a new collection -- it's
just two new files (solrconfig.xml and schema.xml)




-Hoss





Re: SolrCore as Singleton?

Posted by Chris Hostetter <ho...@fucit.org>.
: I am currently in the startup phase of my thesis regarding open source
: and enterprise search. After having worked at perhaps the leading major
: enterprise search company, I have the impression that multiple
: collections is a very common feature (and very sought-after). It is a
: trend I see not just directly from my work, but most certainly also as a
: result of enterprise search solutions becoming more common in general.

SolrCore being a singleton doesn't prevent you from having multiple
collections per JVM -- you just need to run multiple instances of the
webapp within a single servlet container using JNDI to specify the
separate solr.home directories; specifics for doing this in Tomcat are on
the wiki...
   http://wiki.apache.org/solr/SolrTomcat

: Until this framework is available with its appropriate configuration
: files, administrator interface and so on in place, it seems a bit
: unnatural to support multiple collections from the same application
: instance.
:
: Bottom line (for now): I think that users looking for enterprise search
: solutions must have a simple way of creating multiple collections from
: within the same application.

Well it's pretty easy right now to make a new collection -- it's
just two new files (solrconfig.xml and schema.xml)
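As a rough sketch of how small those two files can be (field names and types here are purely illustrative -- the example schema.xml that ships with Solr is the real starting point), a second collection's schema.xml might be no more than:

```xml
<schema name="collection2">
  <types>
    <!-- a single primitive type is enough for a toy schema -->
    <fieldtype name="string" class="solr.StrField" />
  </types>
  <fields>
    <field name="id" type="string" indexed="true" stored="true" />
    <field name="title" type="string" indexed="true" stored="true" />
  </fields>
  <uniqueKey>id</uniqueKey>
  <defaultSearchField>title</defaultSearchField>
</schema>
```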




-Hoss


Re: SolrCore as Singleton?

Posted by Eivind Hasle Amundsen <ei...@ifi.uio.no>.
Chris Hostetter wrote:
> I'm going to sidestep the issue of whether there *was* a good reason for
> it, as well as the "does the singleton pattern make sense for the current
> usage" question, and answer what i think is an equally significant
> question: "what are the implications of trying to change it now?" ... the
> biggest i can think of being that SolrConfig is also a static singleton,
> and a *lot* of code in the Solr code base would need to be changed to
> support multiple SolrConfigs ... and without multiple SolrConfigs, there
> really isn't any reason to have multiple SolrCores.

This actually underlines that my guess was right to a certain extent. 
Changing from singleton is not straightforward.

I am currently in the startup phase of my thesis regarding open source 
and enterprise search. After having worked at perhaps the leading major 
enterprise search company, I have the impression that multiple 
collections is a very common feature (and very sought-after). It is a 
trend I see not just directly from my work, but most certainly also as a 
result of enterprise search solutions becoming more common in general.

However I must say that Solr seems to be approaching the problem from a 
very logical angle. What really is missing is a more abstract layer, 
call it application framework, that probably will come afterwards 
anyway. This will perhaps evolve naturally as part of the Solr project 
at a later stage, or perhaps even as a separate open source project 
building on Solr.

Until this framework is available with its appropriate configuration 
files, administrator interface and so on in place, it seems a bit 
unnatural to support multiple collections from the same application 
instance.

Bottom line (for now): I think that users looking for enterprise search 
solutions must have a simple way of creating multiple collections from 
within the same application.

I apologize for my very philosophical e-mail, but I tend to become 
somewhat visionary and conceptual after a few beers, and this might not 
be the perfect forum for these discussions(?) :)

Eivind

Re: SolrCore as Singleton?

Posted by Chris Hostetter <ho...@fucit.org>.
: Is there a good reason for implementing SolrCore as a Singleton?

I'm going to sidestep the issue of whether there *was* a good reason for
it, as well as the "does the singleton pattern make sense for the current
usage" question, and answer what i think is an equally significant
question: "what are the implications of trying to change it now?" ... the
biggest i can think of being that SolrConfig is also a static singleton,
and a *lot* of code in the Solr code base would need to be changed to
support multiple SolrConfigs ... and without multiple SolrConfigs, there
really isn't any reason to have multiple SolrCores.



-Hoss


Re: SolrCore as Singleton?

Posted by Eivind Hasle Amundsen <ei...@ifi.uio.no>.
> If there is no specific reason for making it a Singleton, I'd vote for 
> removing this so that the
> SolrCore(dataDir, schema) constructor could be used to instantiate 
> multiple cores.

I agree with your arguments. However (although being new to Solr) there 
is more than one way to do it, I think.

To be more specific it seems that using several different indexes with 
individual datadirs and schemas is very useful, based on my impression 
that many enterprise users seem to want this functionality. It is not 
difficult to imagine such a usage pattern or implementation, in its 
abstract sense, of Solr for almost all uses.

However (and this is where most you guys should fill me in), it could be 
wasteful to run multiple complete instances. Could information be shared 
in some way between the instances to save on resources? Perhaps what I 
am really trying to say here, is that we have to look at the whole model 
when considering how to implement better support for the desired usage 
pattern outlined above.

Eivind

SolrCore as Singleton?

Posted by Joachim Martin <jm...@path-works.com>.
Is there a good reason for implementing SolrCore as a Singleton?

We are experimenting with running Solr as a Spring service embedded in
our app.  Since it is a Singleton, we cannot have more than one index
(not currently a problem, but could be).

I note the comment:

  // Singleton for now...

If there is no specific reason for making it a Singleton, I'd vote for 
removing this so that the
SolrCore(dataDir, schema) constructor could be used to instantiate 
multiple cores.

Seems to me that since the primary usage scenario of solr is access via 
REST (i.e. no Solr jar/API),
the Singleton pattern is not necessary here.
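For readers unfamiliar with the pattern under discussion, here is a generic, self-contained sketch (not Solr's actual code) of why a static singleton caps you at one instance per JVM:

```java
// Generic illustration of the singleton pattern (hypothetical class, not
// Solr's SolrCore): a static instance plus a private constructor means a
// second "core" can never be created -- later calls to getInstance()
// silently return the first object, ignoring their arguments.
class SingletonCore {
    private static SingletonCore instance;
    private final String dataDir;

    private SingletonCore(String dataDir) {
        this.dataDir = dataDir;
    }

    public static synchronized SingletonCore getInstance(String dataDir) {
        if (instance == null) {
            instance = new SingletonCore(dataDir);
        }
        return instance; // a second call gets the same object back
    }

    public String getDataDir() {
        return dataDir;
    }

    public static void main(String[] args) {
        SingletonCore a = SingletonCore.getInstance("/data/index1");
        SingletonCore b = SingletonCore.getInstance("/data/index2");
        // prints: true /data/index1
        System.out.println((a == b) + " " + b.getDataDir());
    }
}
```

Removing the static instance (as Joachim suggests) would let each call construct an independent core with its own data directory.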

--Joachim

Re: Double Solr Installation on Single Tomcat (or Double Index)

Posted by Przemysław Brzozowski <br...@interia.pl>.

Tom Weber wrote:
> Hello,
>
>   I need to have a second separate index (separate data) on the same 
> server.
>
>   Is there a possibility to do this in a single solr install on a 
> tomcat server or do I need to have a second instance in the same 
> tomcat install ?
>
You will need separate instances within the same Tomcat.

>   If either one is possible, does somebody have some advice on how to set
> this up, and how to be sure that both indexes do not interact?
>
Create a context XML file for each Solr application in the folder
CATALINA_HOME\conf\Catalina\localhost\ (e.g. context_name.xml).



<Context docBase="${catalina.home}/..../solr.war" debug="0" crossContext="true">
  <Environment name="solr/home" type="java.lang.String" value="${catalina.home}\solr_data_files\" override="true" />
</Context>

Adjust docBase to point at solr.war.
Adjust solr/home to point at solr_data_files - a different folder for
each Solr instance.

If the context XML file is called solr1.xml, then you can access that
Solr instance using the following URL: http://host:port/solr1/admin.

>   Many thanks for any help,
>
>   Best Greetings,
>
>   Tom
>
>




Re: Double Solr Installation on Single Tomcat (or Double Index)

Posted by Yonik Seeley <yo...@apache.org>.
Another way to run multiple solr webapps with Tomcat involves context
fragments.  It allows you to use a single copy of the solr.war but
specify different configs (via different solr homes).

http://wiki.apache.org/solr/SolrTomcat

-Yonik


On 9/6/06, sangraal aiken <sa...@gmail.com> wrote:
> I've set up 2 separate Solr indexes on one Tomcat instance. I basically
> created two separate Solr webapps. I have one webapp that is the client to
> both Solr instances as well. So the whole setup is 3 webapps.
>
> I have one set of Solr source classes and an ant task to build a jar file
> and copy it into the lib directory of both Solr webapps. This way if you
> customize your Solr installs you only have to do it once. Each Solr webapp
> obviously needs its own Solr config and data directories, which are
> configurable through solrConfig. Both indexes are completely separate and
> configurable independently through these config files.
>
> If you need more detail let me know, I'll try to help you out.
>
> -S
>
> On 9/6/06, Tom Weber <to...@rtl.lu> wrote:
> >
> > Hello,
> >
> >    I need to have a second separate index (separate data) on the same
> > server.
> >
> >    Is there a possibility to do this in a single solr install on a
> > tomcat server or do I need to have a second instance in the same
> > tomcat install ?
> >
> >    If either one is possible, does somebody have some advice on how to
> > set this up, and how to be sure that both indexes do not interact?
> >
> >    Many thanks for any help,
> >
> >    Best Greetings,
> >
> >    Tom
> >

Re: Double Solr Installation on Single Tomcat (or Double Index)

Posted by sangraal aiken <sa...@gmail.com>.
I've set up 2 separate Solr indexes on one Tomcat instance. I basically
created two separate Solr webapps. I have one webapp that is the client to
both Solr instances as well. So the whole setup is 3 webapps.

I have one set of Solr source classes and an ant task to build a jar file
and copy it into the lib directory of both Solr webapps. This way if you
customize your Solr installs you only have to do it once. Each Solr webapp
obviously needs its own Solr config and data directories, which are
configurable through solrConfig. Both indexes are completely separate and
configurable independently through these config files.

If you need more detail let me know, I'll try to help you out.

-S

On 9/6/06, Tom Weber <to...@rtl.lu> wrote:
>
> Hello,
>
>    I need to have a second separate index (separate data) on the same
> server.
>
>    Is there a possibility to do this in a single solr install on a
> tomcat server or do I need to have a second instance in the same
> tomcat install ?
>
>    If either one is possible, does somebody have some advice on how to
> set this up, and how to be sure that both indexes do not interact?
>
>    Many thanks for any help,
>
>    Best Greetings,
>
>    Tom
>

Re: Got it working! And some questions

Posted by James liu <li...@gmail.com>.
- Is the solr php in the wiki working out of the box for anyone?
Show your php.ini. Did you tune your PHP for performance?




2006/9/10, Brian Lucas <bl...@gmail.com>:
>
> Hi Michael,
>
> I apologize for the lack of testing on the SolPHP.  I had to "strip" it
> down
> significantly to turn it into a general class that would be usable and the
> version up there has not been extensively tested yet (I'm almost ready to
> get back to that and "revise" it), plus much of my coding is done in Rails
> at the moment.  However...
>
> If you have a new version, could you send it over my way or just upload it
> to the wiki?  I'd like to take a look at the changes and throw your
> revised
> version up there or integrate both versions into a cleaner revision of the
> version already there.
>
> With respect to batch queries, it's already designed to do that (that's
> why
> you see "array($array)" in the example, because it accepts an array of
> updates) but I'd definitely like to see how you revised it.
>
> Thanks,
> Brian
>
>
> -----Original Message-----
> From: Michael Imbeault [mailto:michael.imbeault@sympatico.ca]
> Sent: Saturday, September 09, 2006 12:30 PM
> To: solr-user@lucene.apache.org
> Subject: Got it working! And some questions
>
> First of all, in reference to
> http://www.mail-archive.com/solr-user@lucene.apache.org/msg00808.html ,
> I got it working! The problem(s) was coming from solPHP; the
> implementation in the wiki isn't really working, to be honest, at least
> for me. I had to modify it significantly at multiple places to get it
> working. Tomcat 5.5, WAMP and Windows XP.
>
> The main problem was that addIndex was sending 1 doc at a time to solr;
> it would cause a problem after a few thousand docs because i was running
> out of resources. I modified solr_update.php to handle batch queries,
> and i'm now sending batches of 1000 docs at a time. Great indexing speed.
>
> Had a slight problem with the curl function of solr_update.php; the
> custom HTTP header wasn't recognized; I now use curl_setopt($ch,
> CURLOPT_POST, 1); curl_setopt($ch, CURLOPT_POSTFIELDS, $post_string); -
> much simpler, and now everything works!
>
> So far I have indexed 15,000,000 documents (my whole collection,
> basically) and the performance i'm getting is INCREDIBLE (sub 100ms
> query time without warmup and no optimization at all on a 7 gigs index -
> and with the cache, it gets stupid fast)! Seriously, Solr amaze me every
> time I use it. I increased HashDocSet Maxsize to 75000, will continue to
> optimize this value - it helped a great deal. I will try disMaxHandler
> soon too; right now the standard one is great. And I will index with a
> better stopword file; the default one could really use improvements.
>
> Some questions (couldn't find the answer in the docs):
>
> - Is the solr php in the wiki working out of the box for anyone? Else we
> could modify the wiki...
>
> - What is the loadFactor variable of HashDocSet? Should I optimize it too?
>
> - What's the units on the size value of the caches? Megs, number of
> queries, kilobytes? Not described anywhere.
>
> - Any way to programmatically change the OR/AND preference of the query
> parser? I set it to AND by default for user queries, but i'd like to set
> it to OR for some server-side queries I must do (find related articles,
> order by score).
>
> - What's the difference between the 2 commit types? Blocking and
> non-blocking. Didn't see any differences at all, tried both.
>
> - Every time I do an <optimize> command, I get the following in my
> catalina logs - should I do anything about it?
>
> 9-Sep-2006 2:24:40 PM org.apache.solr.core.SolrException log
> SEVERE: Exception during commit/optimize:java.io.EOFException: no more
> data available - expected end tag </optimize> to close start tag
> <optimize> from line 1, parser stopped on START_TAG seen <optimize>...
> @1:10
>
> - Any benefits of setting the allowed memory for Tomcat higher? Right
> now I'm allocating 384 megs.
>
> Can't wait to try the new Faceted Queries... seriously, solr is really,
> really awesome so far. Thanks for all your work, and sorry for all
> the questions!
>
> --
> Michael Imbeault
> CHUL Research Center (CHUQ)
> 2705 boul. Laurier
> Ste-Foy, QC, Canada, G1V 4G2
> Tel: (418) 654-2705, Fax: (418) 654-2212
>
>

Re: Got it working! And some questions

Posted by Yonik Seeley <yo...@apache.org>.
On 9/9/06, Michael Imbeault <mi...@sympatico.ca> wrote:
> The main problem was that addIndex was sending 1 doc at a time to solr;
> it would cause a problem after a few thousand docs because i was running
> out of resources.

Sending one doc at a time should be fine... you shouldn't run out of
resources.
There must be a bug somewhere...

-Yonik

Re: Got it working! And some questions

Posted by Chris Hostetter <ho...@fucit.org>.
: First of all, it seems the mailing list is having some troubles? Some of
: my posts end up in the wrong thread (even new threads I post), I don't
: receive them in my mail, and they're present only in the 'date archive'
: of http://www.mail-archive.com, and not in the 'thread' one? I don't
: receive some of the other peoples post in my mail too, problems started
: last week I think.

i haven't noticed any problems with mail not making it through - some mail
clients (gmail for example) seem to suppress messages they can tell you
sent, maybe that's what's happening on your end?  As for
threads you start not showing up on the "thread" list ... according to
my mailbox, all but one message i've received from you included a
"References:" header (if not an In-Reply-To header) which causes some mail
archivers to assume it's part of an existing thread (this thread for
instance is considered part of the "Double Solr Installation on Single
Tomcat (or Double Index)" thread) ... you may want to experiment with
your mail client (off list) to see if you can figure out when/why this is
happening.

: Secondly, Chris, thanks for all the useful answers, everything is much
: clearer now. This info should be added to the wiki I think; should I do

feel free ... that's why it's a wiki.

: it? I'm still a little disappointed that I can't change the OR/AND
: parsing by just changing some parameter (like I can do for the number of
: results returned, for example); adding a OR between each word in the
: text i want to compare sounds suboptimal, but i'll probably do it that
: way; its a very minor nitpick, solr is awesome, as I said before.

it would be a fairly simple option to add just like changing the
default field (patches welcome!) but as i said -- typically if you don't
want the default behavior you are programmatically generating the query
anyway, and already adding some markup, a little more doesn't make it less
optimal.
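The workaround Hoss describes -- explicitly joining terms with OR when generating the query server-side -- can be sketched in a few lines (a generic illustration, not Solr API code):

```java
import java.util.Arrays;
import java.util.List;

// Builds an explicit OR query string from individual terms, so the
// result parses the same way regardless of the parser's default operator.
class OrJoin {
    public static String orQuery(List<String> terms) {
        return String.join(" OR ", terms);
    }

    public static void main(String[] args) {
        // prints: solr OR lucene OR search
        System.out.println(orQuery(Arrays.asList("solr", "lucene", "search")));
    }
}
```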





-Hoss


Re: Got it working! And some questions

Posted by Chris Hostetter <ho...@fucit.org>.
: Maybe something like q.op or q.oper if it *only* applies to q.  Which
: begs the question... what *does* it apply to?  At first blush, it
: doesn't seem like it should apply to other queries like fq, facet
: queries, and esp queries defined in solrconfig.xml.  I think that
: would be very surprising.

agreed ... note the comment i put into SolrPluginUtils.parseFilterQueries when
i added fq support to StandardRequestHandler...

    /* Ignore SolrParams.DF - could have init param FQs assuming the
     * schema default with query param DF intented to only affect Q.
     * If user doesn't want schema default, they should be explicit in the FQ.
     */

... i would think a "do" or "op" or "q.op" param should *definitely* only
influence the "q" param.





-Hoss


Re: Got it working! And some questions

Posted by Chris Hostetter <ho...@fucit.org>.
: SolrQueryParser now knows nothing about the default operator, it is
: set from QueryParsing.parseQuery() when passed a SolrParams.

i didn't test it, but it looks clean to me.

the only other thing i would do is beef up the javadocs for
SolrQueryParser (to clarify that IndexSchema is only used for determining
field format) and QueryParsing.parseQuery (to clarify that it *does* use
IndexSearcher to get extra parsing options).

: QueryParsing.parseQuery() methods could be simplified, perhaps even
	...
: It could even get the "q" parameter from there, but there is code
: that passes expressions that don't come from "q".  Maybe we could

...yeah, its utility for simple queries regardless of the "primary"
language of a request handler is key.

: have two parseQuery() methods:  parseQuery(String expression,
: SolrQueryRequest req) and parseQuery(SolrQueryRequest req), and for
: the latter the "q" parameter is pulled from the request and used as
: the expression.

That sounds good to me ... but it doesn't seem critical ... clean house as
much as you want, but i don't think anybody else will mind a bit of dust
on the window sills.



-Hoss


Re: Got it working! And some questions

Posted by Erik Hatcher <er...@ehatchersolutions.com>.
On Sep 12, 2006, at 4:47 PM, Chris Hostetter wrote:
> : I've implemented the ability to override the default operator with
> : q.op=AND|OR.  The patch is pasted below for your review.
>
> if i'm reading that right, one subtlety is that "new
> SolrQueryParser(schema,field)" no longer pays attention to
> schema.getQueryParserDefaultOperator() -- that only becomes
> applicable when using QueryParsing.parseQuery
>
> ...i am very okay with this change, i wasn't really a fan of the fact
> that the SolrQueryParser pulled that info out of the IndexSchema in its
> constructor previously, i just wanted to point out that this patch
> would change that.
>
> Perhaps the constructor for SolrQueryParser shouldn't be aware of the
> op at all (either from the schema or from the SolrParams) -- and
> setting it should be left to QueryParsing.parseQuery (or some other
> utility in the QueryParsing class) ... personally i'm a fan of leaving
> SolrQueryParser as much like QueryParser as possible -- with the only
> real change being the knowledge of the individual field formats.

I've reworked it based on your feedback.  The patch is pasted below.

SolrQueryParser now knows nothing about the default operator; it is
set from QueryParsing.parseQuery() when passed a SolrParams.

QueryParsing.parseQuery() methods could be simplified, perhaps even  
into a single method, that took a query expression and a  
SolrQueryRequest, where it can get the SolrParams and  IndexSchema.   
It could even get the "q" parameter from there, but there is code  
that passes expressions that don't come from "q".  Maybe we could  
have two parseQuery() methods:  parseQuery(String expression,  
SolrQueryRequest req) and parseQuery(SolrQueryRequest req), and for  
the latter the "q" parameter is pulled from the request and used as  
the expression.

As it is, the patch below works fine and I'm happy to commit it, but  
am happy to rework this sort of thing to get it as clean as others like.

	Erik


Index: src/java/org/apache/solr/search/SolrQueryParser.java
===================================================================
--- src/java/org/apache/solr/search/SolrQueryParser.java	(revision  
442772)
+++ src/java/org/apache/solr/search/SolrQueryParser.java	(working copy)
@@ -37,7 +37,6 @@
      super(defaultField == null ? schema.getDefaultSearchFieldName 
() : defaultField, schema.getQueryAnalyzer());
      this.schema = schema;
      setLowercaseExpandedTerms(false);
-    setDefaultOperator("AND".equals 
(schema.getQueryParserDefaultOperator()) ? QueryParser.Operator.AND :  
QueryParser.Operator.OR);
    }
    protected Query getFieldQuery(String field, String queryText)  
throws ParseException {
Index: src/java/org/apache/solr/search/QueryParsing.java
===================================================================
--- src/java/org/apache/solr/search/QueryParsing.java	(revision 442772)
+++ src/java/org/apache/solr/search/QueryParsing.java	(working copy)
@@ -19,6 +19,7 @@
import org.apache.lucene.search.*;
import org.apache.solr.search.function.*;
import org.apache.lucene.queryParser.ParseException;
+import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.Term;
import org.apache.solr.core.SolrCore;
@@ -26,6 +27,7 @@
import org.apache.solr.schema.IndexSchema;
import org.apache.solr.schema.SchemaField;
import org.apache.solr.schema.FieldType;
+import org.apache.solr.request.SolrParams;
import java.util.ArrayList;
import java.util.regex.Pattern;
@@ -37,6 +39,7 @@
   * @version $Id$
   */
public class QueryParsing {
+  public static final String OP = "q.op";
    public static Query parseQuery(String qs, IndexSchema schema) {
      return parseQuery(qs, null, schema);
@@ -58,8 +61,26 @@
      }
    }
+  public static Query parseQuery(String qs, String defaultField,  
SolrParams params, IndexSchema schema) {
+    try {
+      String opParam = params.get(OP,  
schema.getQueryParserDefaultOperator());
+      QueryParser.Operator defaultOperator = "AND".equals(opParam) ?  
QueryParser.Operator.AND : QueryParser.Operator.OR;
+      SolrQueryParser parser = new SolrQueryParser(schema,  
defaultField);
+      parser.setDefaultOperator(defaultOperator);
+      Query query = parser.parse(qs);
+      if (SolrCore.log.isLoggable(Level.FINEST)) {
+        SolrCore.log.finest("After QueryParser:" + query);
+      }
+      return query;
+
+    } catch (ParseException e) {
+      SolrCore.log(e);
+      throw new SolrException(400,"Error parsing Lucene query",e);
+    }
+  }
+
    /***
     * SortSpec encapsulates a Lucene Sort and a count of the number  
of documents
     * to return.
Index: src/java/org/apache/solr/request/StandardRequestHandler.java
===================================================================
--- src/java/org/apache/solr/request/StandardRequestHandler.java	 
(revision 442772)
+++ src/java/org/apache/solr/request/StandardRequestHandler.java	 
(working copy)
@@ -105,7 +105,7 @@
        List<String> commands = StrUtils.splitSmart(sreq,';');
        String qs = commands.size() >= 1 ? commands.get(0) : "";
-      Query query = QueryParsing.parseQuery(qs, defaultField,  
req.getSchema());
+      Query query = QueryParsing.parseQuery(qs, defaultField, p,  
req.getSchema());
        // If the first non-query, non-filter command is a simple  
sort on an indexed field, then
        // we can use the Lucene sort ability.
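As a usage sketch (not part of the patch), overriding the default operator per request would just mean adding q.op to the query string; the terms and fields below are placeholders:

```python
# Hypothetical client-side sketch: build a Solr select query string
# that overrides the schema's default operator for this one request.
from urllib.parse import urlencode

params = {
    "q": "solr lucene",  # two terms; q.op decides how they combine
    "q.op": "OR",        # per-request override of the schema default
    "fl": "id,score",
}
query_string = urlencode(params)
# e.g. GET http://localhost:8983/solr/select?<query_string>
print(query_string)
```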


Re: Got it working! And some questions

Posted by Chris Hostetter <ho...@fucit.org>.
: I've implemented the ability to override the default operator with
: q.op=AND|OR.  The patch is pasted below for your review.

if i'm reading that right, one subtlety is that "new
SolrQueryParser(schema,field)" no longer pays attention to
schema.getQueryParserDefaultOperator() -- that only becomes
applicable when using QueryParsing.parseQuery

...i am very okay with this change, i wasn't really a fan of the fact that
the SolrQueryParser pulled that info out of the IndexSchema in it's
constructor previously, i just wanted to point out that this patch would
change that.

Perhaps the constructor for SolrQueryParser shouldn't be aware of the op
at all (either from the schema or from the SolrParams) -- and setting it
should be left to QueryParsing.parseQuery (or some other utility in the
QueryParsing class) ... personally i'm a fan of leaving SolrQueryParser as
much like QueryParser as possible -- with the only real change being the
knowledge of the individual field formats.


: Index: src/java/org/apache/solr/search/SolrQueryParser.java
: ===================================================================
: --- src/java/org/apache/solr/search/SolrQueryParser.java	(revision
: 442689)
: +++ src/java/org/apache/solr/search/SolrQueryParser.java	(working copy)
: @@ -34,10 +34,14 @@
:     protected final IndexSchema schema;
:     public SolrQueryParser(IndexSchema schema, String defaultField) {
: +    this(schema, defaultField, QueryParser.Operator.OR);
: +  }
: +
: +  public SolrQueryParser(IndexSchema schema, String defaultField,
: QueryParser.Operator defaultOperator) {
:       super(defaultField == null ? schema.getDefaultSearchFieldName
: () : defaultField, schema.getQueryAnalyzer());
:       this.schema = schema;
:       setLowercaseExpandedTerms(false);
: -    setDefaultOperator("AND".equals
: (schema.getQueryParserDefaultOperator()) ? QueryParser.Operator.AND :
: QueryParser.Operator.OR);
: +    setDefaultOperator(defaultOperator);
:     }
:     protected Query getFieldQuery(String field, String queryText)
: throws ParseException {
: Index: src/java/org/apache/solr/search/QueryParsing.java
: ===================================================================
: --- src/java/org/apache/solr/search/QueryParsing.java	(revision 442689)
: +++ src/java/org/apache/solr/search/QueryParsing.java	(working copy)
: @@ -19,6 +19,7 @@
: import org.apache.lucene.search.*;
: import org.apache.solr.search.function.*;
: import org.apache.lucene.queryParser.ParseException;
: +import org.apache.lucene.queryParser.QueryParser;
: import org.apache.lucene.document.Field;
: import org.apache.lucene.index.Term;
: import org.apache.solr.core.SolrCore;
: @@ -26,6 +27,7 @@
: import org.apache.solr.schema.IndexSchema;
: import org.apache.solr.schema.SchemaField;
: import org.apache.solr.schema.FieldType;
: +import org.apache.solr.request.SolrParams;
: import java.util.ArrayList;
: import java.util.regex.Pattern;
: @@ -37,6 +39,7 @@
:    * @version $Id$
:    */
: public class QueryParsing {
: +  public static final String OP = "q.op";
:     public static Query parseQuery(String qs, IndexSchema schema) {
:       return parseQuery(qs, null, schema);
: @@ -58,8 +61,24 @@
:       }
:     }
: +  public static Query parseQuery(String qs, String defaultField,
: SolrParams params, IndexSchema schema) {
: +    try {
: +      String opParam = params.get(OP,
: schema.getQueryParserDefaultOperator());
: +      QueryParser.Operator defaultOperator = "AND".equals(opParam) ?
: QueryParser.Operator.AND : QueryParser.Operator.OR;
: +      Query query = new SolrQueryParser(schema, defaultField,
: defaultOperator).parse(qs);
: +      if (SolrCore.log.isLoggable(Level.FINEST)) {
: +        SolrCore.log.finest("After QueryParser:" + query);
: +      }
: +      return query;
: +
: +    } catch (ParseException e) {
: +      SolrCore.log(e);
: +      throw new SolrException(400,"Error parsing Lucene query",e);
: +    }
: +  }
: +
:     /***
:      * SortSpec encapsulates a Lucene Sort and a count of the number
: of documents
:      * to return.
: Index: src/java/org/apache/solr/request/StandardRequestHandler.java
: ===================================================================
: --- src/java/org/apache/solr/request/StandardRequestHandler.java
: (revision 442689)
: +++ src/java/org/apache/solr/request/StandardRequestHandler.java
: (working copy)
: @@ -94,7 +94,7 @@
:         List<String> commands = StrUtils.splitSmart(sreq,';');
:         String qs = commands.size() >= 1 ? commands.get(0) : "";
: -      Query query = QueryParsing.parseQuery(qs, defaultField,
: req.getSchema());
: +      Query query = QueryParsing.parseQuery(qs, defaultField, p,
: req.getSchema());
:         // If the first non-query, non-filter command is a simple
: sort on an indexed field, then
:         // we can use the Lucene sort ability.
:



-Hoss


Re: Got it working! And some questions

Posted by Erik Hatcher <er...@ehatchersolutions.com>.
On Sep 11, 2006, at 2:52 PM, Yonik Seeley wrote:

> On 9/11/06, Erik Hatcher <er...@ehatchersolutions.com> wrote:
>>
>> On Sep 10, 2006, at 10:47 PM, Michael Imbeault wrote:
>> >  I'm still a little disappointed that I can't change the OR/AND
>> > parsing by just changing some parameter (like I can do for the
>> > number of results returned, for example); adding a OR between each
>> > word in the text i want to compare sounds suboptimal, but i'll
>> > probably do it that way; its a very minor nitpick, solr is awesome,
>> > as I said before.
>>
>> I'm the one that added support for controlling the default operator
>> of Solr's query parser, and I hadn't considered the use case of
>> controlling that setting from a request parameter.  It should be easy
>> enough to add.  I'll take a look at adding that support and commit it
>> once I have it working.
>>
>> What parameter name should be used for this?    do=[AND|OR] (for
>> default operator)?  We have df for default field.
>
> Maybe something like q.op or q.oper if it *only* applies to q.  Which
> begs the question... what *does* it apply to?  At first blush, it
> doesn't seem like it should apply to other queries like fq, facet
> queries, and esp queries defined in solrconfig.xml.  I think that
> would be very surprising.

I've implemented the ability to override the default operator with  
q.op=AND|OR.  The patch is pasted below for your review.

The one thing I don't like is that QueryParsing.parseQuery(String qs,  
String defaultField, SolrParams params, IndexSchema schema) is a bit  
redundant in that it takes defaultField which can also be gleaned  
from params, but StandardRequestHandler uses "df" for highlighting also.

I'm happy to commit this if there are no objections or suggestions  
for improvement (and of course update the wiki documentation for the  
parameters).

	Erik



Index: src/java/org/apache/solr/search/SolrQueryParser.java
===================================================================
--- src/java/org/apache/solr/search/SolrQueryParser.java	(revision  
442689)
+++ src/java/org/apache/solr/search/SolrQueryParser.java	(working copy)
@@ -34,10 +34,14 @@
    protected final IndexSchema schema;
    public SolrQueryParser(IndexSchema schema, String defaultField) {
+    this(schema, defaultField, QueryParser.Operator.OR);
+  }
+
+  public SolrQueryParser(IndexSchema schema, String defaultField,  
QueryParser.Operator defaultOperator) {
      super(defaultField == null ? schema.getDefaultSearchFieldName 
() : defaultField, schema.getQueryAnalyzer());
      this.schema = schema;
      setLowercaseExpandedTerms(false);
-    setDefaultOperator("AND".equals 
(schema.getQueryParserDefaultOperator()) ? QueryParser.Operator.AND :  
QueryParser.Operator.OR);
+    setDefaultOperator(defaultOperator);
    }
    protected Query getFieldQuery(String field, String queryText)  
throws ParseException {
Index: src/java/org/apache/solr/search/QueryParsing.java
===================================================================
--- src/java/org/apache/solr/search/QueryParsing.java	(revision 442689)
+++ src/java/org/apache/solr/search/QueryParsing.java	(working copy)
@@ -19,6 +19,7 @@
import org.apache.lucene.search.*;
import org.apache.solr.search.function.*;
import org.apache.lucene.queryParser.ParseException;
+import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.Term;
import org.apache.solr.core.SolrCore;
@@ -26,6 +27,7 @@
import org.apache.solr.schema.IndexSchema;
import org.apache.solr.schema.SchemaField;
import org.apache.solr.schema.FieldType;
+import org.apache.solr.request.SolrParams;
import java.util.ArrayList;
import java.util.regex.Pattern;
@@ -37,6 +39,7 @@
   * @version $Id$
   */
public class QueryParsing {
+  public static final String OP = "q.op";
    public static Query parseQuery(String qs, IndexSchema schema) {
      return parseQuery(qs, null, schema);
@@ -58,8 +61,24 @@
      }
    }
+  public static Query parseQuery(String qs, String defaultField,  
SolrParams params, IndexSchema schema) {
+    try {
+      String opParam = params.get(OP,  
schema.getQueryParserDefaultOperator());
+      QueryParser.Operator defaultOperator = "AND".equals(opParam) ?  
QueryParser.Operator.AND : QueryParser.Operator.OR;
+      Query query = new SolrQueryParser(schema, defaultField,  
defaultOperator).parse(qs);
+      if (SolrCore.log.isLoggable(Level.FINEST)) {
+        SolrCore.log.finest("After QueryParser:" + query);
+      }
+      return query;
+
+    } catch (ParseException e) {
+      SolrCore.log(e);
+      throw new SolrException(400,"Error parsing Lucene query",e);
+    }
+  }
+
    /***
     * SortSpec encapsulates a Lucene Sort and a count of the number  
of documents
     * to return.
Index: src/java/org/apache/solr/request/StandardRequestHandler.java
===================================================================
--- src/java/org/apache/solr/request/StandardRequestHandler.java	 
(revision 442689)
+++ src/java/org/apache/solr/request/StandardRequestHandler.java	 
(working copy)
@@ -94,7 +94,7 @@
        List<String> commands = StrUtils.splitSmart(sreq,';');
        String qs = commands.size() >= 1 ? commands.get(0) : "";
-      Query query = QueryParsing.parseQuery(qs, defaultField,  
req.getSchema());
+      Query query = QueryParsing.parseQuery(qs, defaultField, p,  
req.getSchema());
        // If the first non-query, non-filter command is a simple  
sort on an indexed field, then
        // we can use the Lucene sort ability.


Re: Got it working! And some questions

Posted by Yonik Seeley <yo...@apache.org>.
On 9/11/06, Erik Hatcher <er...@ehatchersolutions.com> wrote:
>
> On Sep 10, 2006, at 10:47 PM, Michael Imbeault wrote:
> >  I'm still a little disappointed that I can't change the OR/AND
> > parsing by just changing some parameter (like I can do for the
> > number of results returned, for example); adding a OR between each
> > word in the text i want to compare sounds suboptimal, but i'll
> > probably do it that way; its a very minor nitpick, solr is awesome,
> > as I said before.
>
> I'm the one that added support for controlling the default operator
> of Solr's query parser, and I hadn't considered the use case of
> controlling that setting from a request parameter.  It should be easy
> enough to add.  I'll take a look at adding that support and commit it
> once I have it working.
>
> What parameter name should be used for this?    do=[AND|OR] (for
> default operator)?  We have df for default field.

Maybe something like q.op or q.oper if it *only* applies to q.  Which
begs the question... what *does* it apply to?  At first blush, it
doesn't seem like it should apply to other queries like fq, facet
queries, and esp queries defined in solrconfig.xml.  I think that
would be very surprising.

-Yonik

Re: Got it working! And some questions

Posted by Michael Imbeault <mi...@sympatico.ca>.
Hello Erik,

Thanks for adding that feature! "do" is fine with me, if "op" is already 
used (not sure about this one).

Erik Hatcher wrote:
>
> On Sep 10, 2006, at 10:47 PM, Michael Imbeault wrote:
>>  I'm still a little disappointed that I can't change the OR/AND 
>> parsing by just changing some parameter (like I can do for the number 
>> of results returned, for example); adding a OR between each word in 
>> the text i want to compare sounds suboptimal, but i'll probably do it 
>> that way; its a very minor nitpick, solr is awesome, as I said before.
>
> I'm the one that added support for controlling the default operator of 
> Solr's query parser, and I hadn't considered the use case of 
> controlling that setting from a request parameter.  It should be easy 
> enough to add.  I'll take a look at adding that support and commit it 
> once I have it working.
>
> What parameter name should be used for this?    do=[AND|OR] (for 
> default operator)?  We have df for default field.
>
>     Erik
>
-- 
Michael Imbeault
CHUL Research Center (CHUQ)
2705 boul. Laurier
Ste-Foy, QC, Canada, G1V 4G2
Tel: (418) 654-2705, Fax: (418) 654-2212


Re: Got it working! And some questions

Posted by Erik Hatcher <er...@ehatchersolutions.com>.
On Sep 10, 2006, at 10:47 PM, Michael Imbeault wrote:
>  I'm still a little disappointed that I can't change the OR/AND  
> parsing by just changing some parameter (like I can do for the  
> number of results returned, for example); adding a OR between each  
> word in the text i want to compare sounds suboptimal, but i'll  
> probably do it that way; its a very minor nitpick, solr is awesome,  
> as I said before.

I'm the one that added support for controlling the default operator  
of Solr's query parser, and I hadn't considered the use case of  
controlling that setting from a request parameter.  It should be easy  
enough to add.  I'll take a look at adding that support and commit it  
once I have it working.

What parameter name should be used for this?    do=[AND|OR] (for  
default operator)?  We have df for default field.

	Erik


Re: Got it working! And some questions

Posted by Michael Imbeault <mi...@sympatico.ca>.
First of all, it seems the mailing list is having some trouble. Some of 
my posts end up in the wrong thread (even new threads I post), I don't 
receive them in my mail, and they're present only in the 'date archive' 
of http://www.mail-archive.com and not in the 'thread' one. I also don't 
receive some other people's posts in my mail; the problems started last 
week, I think.

Secondly, Chris, thanks for all the useful answers; everything is much 
clearer now. This info should be added to the wiki, I think; should I do 
it? I'm still a little disappointed that I can't change the OR/AND 
parsing just by changing some parameter (as I can for the number of 
results returned, for example); adding an OR between each word in the 
text I want to compare sounds suboptimal, but I'll probably do it that 
way. It's a very minor nitpick; Solr is awesome, as I said before.

@ Brian Lucas: Don't worry, SolPHP was still 99.9% functional -- great 
work; part of it sending a doc at a time was my fault, since I was 
following the exact sequence (add to array, submit) shown in the docs. The 
only thing that could be added is a big "//TODO: change this code" 
before sections you have to change to make it work for a particular 
schema. I'm pretty sure the custom-header curl submit works for everyone 
other than me; I'm on a Windows test box with WAMP on it, so it may be 
caused by that. I'll send you the changes I made to the code tomorrow 
anyway; as I said, nothing major.

Chris Hostetter wrote:
> : - What is the loadFactor variable of HashDocSet? Should I optimize it too?
>
> this is the same as the loadFactor in a HashMap constructor -- but i don't
> think it has much affect on performance since the HashDocSets never
> "grow".
>
> I personally have never tuned the loadFactor :)
>
> : - What's the units on the size value of the caches? Megs, number of
> : queries, kilobytes? Not described anywhere.
>
> "entries" ... the number of items allowed in the cache.
>
> : - Any way to programatically change the OR/AND preference of the query
> : parser? I set it to AND by default for user queries, but i'd like to set
> : it to OR for some server-side queries I must do (find related articles,
> : order by score).
>
> you mean using StandardRequestHandler? ... not that i can think of off the
> top of my head, but typicaly it makes sense to just configure what you
> want for your "users" in the schema, and then make any machine generated
> queries be explicit.
>
> : - Whats the difference between the 2 commits type? Blocking and
> : non-blocking. Didn't see any differences at all, tried both.
>
> do you mean the waitFlush and waitSearcher options?
> if either of those is true, you shouldn't get a response back from the
> server untill they have finished.  if they are false, then the server
> should respond instantly even if it takes several seconds (or maybe even
> minutes) to complete the operation (optimizes can take a while in some
> cases -- as can opening newSearchers if you have a lot of cache warming
> configured)
>
> : - Every time I do an <optimize> command, I get the following in my
> : catalina logs - should I do anything about it?
>
> the optimize command needs to be well formed XML, try "<optimize/>"
> instead of just "<optimize>"
>
> : - Any benefits of setting the allowed memory for Tomcat higher? Right
> : now im allocating 384 megs.
>
> the more memory you've got, the more cachng you can support .. but if
> your index changes so frequently compared to the rate of *unique*
> queries you get that your caches never fill up, it may not matter.
>
>
>
>
> -Hoss
>   
-- 
Michael Imbeault
CHUL Research Center (CHUQ)
2705 boul. Laurier
Ste-Foy, QC, Canada, G1V 4G2
Tel: (418) 654-2705, Fax: (418) 654-2212


Re: Got it working! And some questions

Posted by Chris Hostetter <ho...@fucit.org>.
: - What is the loadFactor variable of HashDocSet? Should I optimize it too?

this is the same as the loadFactor in a HashMap constructor -- but i don't
think it has much effect on performance since the HashDocSets never
"grow".

I personally have never tuned the loadFactor :)

: - What's the units on the size value of the caches? Megs, number of
: queries, kilobytes? Not described anywhere.

"entries" ... the number of items allowed in the cache.
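For reference, that means the size values in solrconfig.xml count cache entries, not bytes; the numbers below are illustrative, not recommendations:

```xml
<!-- size / initialSize / autowarmCount are all entry counts -->
<filterCache
  class="solr.LRUCache"
  size="512"
  initialSize="512"
  autowarmCount="256"/>
```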

: - Any way to programatically change the OR/AND preference of the query
: parser? I set it to AND by default for user queries, but i'd like to set
: it to OR for some server-side queries I must do (find related articles,
: order by score).

you mean using StandardRequestHandler? ... not that i can think of off the
top of my head, but typically it makes sense to just configure what you
want for your "users" in the schema, and then make any machine-generated
queries explicit.

: - Whats the difference between the 2 commits type? Blocking and
: non-blocking. Didn't see any differences at all, tried both.

do you mean the waitFlush and waitSearcher options?
if either of those is true, you shouldn't get a response back from the
server until they have finished.  if they are false, then the server
should respond instantly even if it takes several seconds (or maybe even
minutes) to complete the operation (optimizes can take a while in some
cases -- as can opening newSearchers if you have a lot of cache warming
configured)
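In update-message terms, that looks like this (attributes as in the XML update syntax):

```xml
<!-- blocking: the response comes back only after the flush completes
     and the new searcher is registered -->
<commit waitFlush="true" waitSearcher="true"/>

<!-- non-blocking: the server answers immediately and finishes the
     work in the background -->
<commit waitFlush="false" waitSearcher="false"/>
```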

: - Every time I do an <optimize> command, I get the following in my
: catalina logs - should I do anything about it?

the optimize command needs to be well formed XML, try "<optimize/>"
instead of just "<optimize>"
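To spell out the well-formedness point:

```xml
<!-- not well-formed XML: a start tag with no matching end tag -->
<optimize>

<!-- well-formed: either an empty-element tag or a matched pair -->
<optimize/>
<optimize></optimize>
```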

: - Any benefits of setting the allowed memory for Tomcat higher? Right
: now im allocating 384 megs.

the more memory you've got, the more caching you can support .. but if
your index changes so frequently compared to the rate of *unique*
queries you get that your caches never fill up, it may not matter.




-Hoss


RE: Got it working! And some questions

Posted by Brian Lucas <bl...@gmail.com>.
Hi Michael,

I apologize for the lack of testing on SolPHP.  I had to "strip" it down
significantly to turn it into a general, usable class, and the version up
there has not been extensively tested yet (I'm almost ready to get back to
that and "revise" it); plus, much of my coding is done in Rails at the
moment.  However...

If you have a new version, could you send it over my way or just upload it
to the wiki?  I'd like to take a look at the changes and throw your revised
version up there or integrate both versions into a cleaner revision of the
version already there.

With respect to batch queries, it's already designed to do that (that's why
you see "array($array)" in the example, because it accepts an array of
updates) but I'd definitely like to see how you revised it.

Thanks,
Brian


-----Original Message-----
From: Michael Imbeault [mailto:michael.imbeault@sympatico.ca] 
Sent: Saturday, September 09, 2006 12:30 PM
To: solr-user@lucene.apache.org
Subject: Got it working! And some questions

First of all, in reference to 
http://www.mail-archive.com/solr-user@lucene.apache.org/msg00808.html, 
I got it working! The problem(s) came from SolPHP; the 
implementation in the wiki isn't really working, to be honest, at least 
for me. I had to modify it significantly in multiple places to get it 
working. Tomcat 5.5, WAMP and Windows XP.

The main problem was that addIndex was sending 1 doc at a time to Solr; 
this would cause a problem after a few thousand docs because I was running 
out of resources. I modified solr_update.php to handle batch updates, 
and I'm now sending batches of 1000 docs at a time. Great indexing speed.

I had a slight problem with the curl function of solr_update.php; the 
custom HTTP header wasn't recognized. I now use curl_setopt($ch, 
CURLOPT_POST, 1); curl_setopt($ch, CURLOPT_POSTFIELDS, $post_string); -- 
much simpler, and now everything works!

So far I have indexed 15,000,000 documents (my whole collection, 
basically) and the performance I'm getting is INCREDIBLE (sub-100ms 
query time without warmup and no optimization at all on a 7 GB index -- 
and with the cache, it gets stupid fast)! Seriously, Solr amazes me every 
time I use it. I increased the HashDocSet maxSize to 75000 and will 
continue to optimize this value -- it helped a great deal. I will try the 
DisMax handler soon too; right now the standard one is great. And I will 
index with a better stopword file; the default one could really use 
improvements.

Some questions (couldn't find the answer in the docs):

- Is the Solr PHP in the wiki working out of the box for anyone? 
Otherwise we could modify the wiki...

- What is the loadFactor variable of HashDocSet? Should I optimize it too?

- What's the units on the size value of the caches? Megs, number of 
queries, kilobytes? Not described anywhere.

- Any way to programmatically change the OR/AND preference of the query 
parser? I set it to AND by default for user queries, but I'd like to set 
it to OR for some server-side queries I must do (find related articles, 
order by score).

- What's the difference between the 2 commit types, blocking and 
non-blocking? Didn't see any differences at all; tried both.

- Every time I do an <optimize> command, I get the following in my 
catalina logs - should I do anything about it?

 9-Sep-2006 2:24:40 PM org.apache.solr.core.SolrException log
SEVERE: Exception during commit/optimize:java.io.EOFException: no more 
data available - expected end tag </optimize> to close start tag 
<optimize> from line 1, parser stopped on START_TAG seen <optimize>... @1:10

- Any benefits of setting the allowed memory for Tomcat higher? Right 
now im allocating 384 megs.

Can't wait to try the new Faceted Queries... seriously, Solr is really, 
really awesome so far. Thanks for all your work, and sorry for all 
the questions!

-- 
Michael Imbeault
CHUL Research Center (CHUQ)
2705 boul. Laurier
Ste-Foy, QC, Canada, G1V 4G2
Tel: (418) 654-2705, Fax: (418) 654-2212


Got it working! And some questions

Posted by Michael Imbeault <mi...@sympatico.ca>.
First of all, in reference to 
http://www.mail-archive.com/solr-user@lucene.apache.org/msg00808.html, 
I got it working! The problem(s) came from SolPHP; the 
implementation in the wiki isn't really working, to be honest, at least 
for me. I had to modify it significantly in multiple places to get it 
working. Tomcat 5.5, WAMP and Windows XP.

The main problem was that addIndex was sending 1 doc at a time to Solr; 
this would cause a problem after a few thousand docs because I was running 
out of resources. I modified solr_update.php to handle batch updates, 
and I'm now sending batches of 1000 docs at a time. Great indexing speed.

I had a slight problem with the curl function of solr_update.php; the 
custom HTTP header wasn't recognized. I now use curl_setopt($ch, 
CURLOPT_POST, 1); curl_setopt($ch, CURLOPT_POSTFIELDS, $post_string); -- 
much simpler, and now everything works!
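Roughly, the batching amounts to wrapping many <doc> elements in a single <add> command and POSTing the whole payload once (a sketch in Python rather than PHP; the field names are placeholders, not my actual schema):

```python
# Sketch of batching: build one <add> command holding many <doc>
# elements, then POST it to /solr/update in a single request instead
# of one request per document.
from xml.sax.saxutils import escape

def build_add_command(docs):
    """docs: list of dicts mapping field name -> value."""
    parts = ["<add>"]
    for doc in docs:
        parts.append("<doc>")
        for field, value in doc.items():
            parts.append('<field name="%s">%s</field>'
                         % (field, escape(str(value))))
        parts.append("</doc>")
    parts.append("</add>")
    return "".join(parts)

batch = [{"id": i, "title": "doc %d" % i} for i in range(1000)]
payload = build_add_command(batch)
# POST `payload` once, with Content-Type: text/xml, to the update URL.
```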

So far I have indexed 15,000,000 documents (my whole collection, 
basically) and the performance I'm getting is INCREDIBLE (sub-100ms 
query time without warmup and no optimization at all on a 7 GB index -- 
and with the cache, it gets stupid fast)! Seriously, Solr amazes me every 
time I use it. I increased the HashDocSet maxSize to 75000 and will 
continue to optimize this value -- it helped a great deal. I will try the 
DisMax handler soon too; right now the standard one is great. And I will 
index with a better stopword file; the default one could really use 
improvements.
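The knob I tuned lives in solrconfig.xml and looks roughly like this (values illustrative; loadFactor shown with what I believe is its default):

```xml
<!-- doc sets up to maxSize docs are kept as hash sets; loadFactor is
     handed to the underlying hash table -->
<HashDocSet maxSize="75000" loadFactor="0.75"/>
```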

Some questions (couldn't find the answer in the docs):

- Is the Solr PHP in the wiki working out of the box for anyone? 
Otherwise we could modify the wiki...

- What is the loadFactor variable of HashDocSet? Should I optimize it too?

- What's the units on the size value of the caches? Megs, number of 
queries, kilobytes? Not described anywhere.

- Any way to programmatically change the OR/AND preference of the query 
parser? I set it to AND by default for user queries, but I'd like to set 
it to OR for some server-side queries I must do (find related articles, 
order by score).

- What's the difference between the 2 commit types, blocking and 
non-blocking? Didn't see any differences at all; tried both.

- Every time I do an <optimize> command, I get the following in my 
catalina logs - should I do anything about it?

 9-Sep-2006 2:24:40 PM org.apache.solr.core.SolrException log
SEVERE: Exception during commit/optimize:java.io.EOFException: no more 
data available - expected end tag </optimize> to close start tag 
<optimize> from line 1, parser stopped on START_TAG seen <optimize>... @1:10

- Any benefits of setting the allowed memory for Tomcat higher? Right 
now im allocating 384 megs.

Can't wait to try the new Faceted Queries... seriously, Solr is really, 
really awesome so far. Thanks for all your work, and sorry for all 
the questions!

-- 
Michael Imbeault
CHUL Research Center (CHUQ)
2705 boul. Laurier
Ste-Foy, QC, Canada, G1V 4G2
Tel: (418) 654-2705, Fax: (418) 654-2212