Posted to solr-user@lucene.apache.org by Tom Weber <to...@rtl.lu> on 2006/09/06 16:02:09 UTC
Double Solr Installation on Single Tomcat (or Double Index)
Hello,
I need to have a second separate index (separate data) on the same
server.
Is there a possibility to do this in a single solr install on a
tomcat server, or do I need to have a second instance in the same
tomcat install?
If either one is possible, does somebody have some advice on how to
set this up, and how to be sure that both indexes do not interact?
Many thanks for any help,
Best Greetings,
Tom
Re: SolrCore as Singleton?
Posted by Eivind Hasle Amundsen <ei...@ifi.uio.no>.
Tim Archambault wrote:
> In regard to the comment about lack of an interface, I view this as a
> benefit of the tool.
>
> Whether I'm developing with Python, PHP, Coldfusion, .NET, Java, etc.
> I can create my own customizable interface. As a coldfusion programmer
> with moderate programming capabilities, this tool is perfect for my
> needs.
That's good to hear. I never meant that a GUI should replace anything at
all. Did it come out that way?
As the product evolves, it is only natural to add capabilities and
features. Some of these should be available from different interfaces,
including GUI(s). However one should be able to interface with the
application at different levels. When Solr gets more complex over time,
care must be taken so it does not get complicated. A more complex
product may offer numerous points of entry, so it is necessary to keep
things simple while still providing centralized configuration.
Following this philosophy, Solr users will be able to choose their
level of interaction.
(In a metaphor, some people prefer using GNU/Linux just by installing a
distro; others compile and become best friends with the command line.)
Eivind
Re: Re: SolrCore as Singleton?
Posted by Tim Archambault <ta...@bangordailynews.net>.
In regard to the comment about lack of an interface, I view this as a
benefit of the tool.
Whether I'm developing with Python, PHP, Coldfusion, .NET, Java, etc.
I can create my own customizable interface. As a coldfusion programmer
with moderate programming capabilities, this tool is perfect for my
needs.
On 9/8/06, Andrew May <am...@ingenta.com> wrote:
> Chris Hostetter wrote:
> > : Nice. Is the same doable under Jetty? (never had to deal with JNDI
> > : under Jetty)
> >
> > i haven't tried it personally, but according to Yoav "reading" JNDI
> > options is part of the Servlet Spec, and billa found a reference to
> > using "<env-entry>" to do so...
> >
> > http://www.nabble.com/Re%3A-multiple-solr-webapps-p3991310.html
> >
> > ...where exactly that option goes in Jetty's configuration isn't something
> > i'm clear on.
> >
>
> <env-entry> values go in web.xml, so it would mean having modified versions of solr.war
> for each collection.
>
> <env-entry> is an optional part of the Servlet spec for standalone servlet
> implementations. The basic version of Jetty does not have any JNDI support, you need to
> use JettyPlus (http://jetty.mortbay.org/jetty5/plus/index.html) for that.
>
> -Andrew
>
Re: SolrCore as Singleton?
Posted by Andrew May <am...@ingenta.com>.
Chris Hostetter wrote:
> : Nice. Is the same doable under Jetty? (never had to deal with JNDI
> : under Jetty)
>
> i haven't tried it personally, but according to Yoav "reading" JNDI
> options is part of the Servlet Spec, and billa found a reference to
> using "<env-entry>" to do so...
>
> http://www.nabble.com/Re%3A-multiple-solr-webapps-p3991310.html
>
> ...where exactly that option goes in Jetty's configuration isn't something
> i'm clear on.
>
<env-entry> values go in web.xml, so it would mean having modified versions of solr.war
for each collection.
<env-entry> is an optional part of the Servlet spec for standalone servlet
implementations. The basic version of Jetty does not have any JNDI support;
you need to use JettyPlus (http://jetty.mortbay.org/jetty5/plus/index.html) for that.
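For reference, the <env-entry> element defined by the Servlet spec has this shape in web.xml (the value shown is an illustrative path, not a recommendation):

```xml
<env-entry>
  <env-entry-name>solr/home</env-entry-name>
  <env-entry-type>java.lang.String</env-entry-type>
  <env-entry-value>/opt/solr/collection1</env-entry-value>
</env-entry>
```

Because this lives inside the war's web.xml, each collection would indeed need its own modified solr.war, as noted above.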
-Andrew
Re: SolrCore as Singleton?
Posted by Chris Hostetter <ho...@fucit.org>.
: Nice. Is the same doable under Jetty? (never had to deal with JNDI
: under Jetty)
i haven't tried it personally, but according to Yoav "reading" JNDI
options is part of the Servlet Spec, and billa found a reference to
using "<env-entry>" to do so...
http://www.nabble.com/Re%3A-multiple-solr-webapps-p3991310.html
...where exactly that option goes in Jetty's configuration isn't something
i'm clear on.
: ----- Original Message ----
: From: Chris Hostetter <ho...@fucit.org>
: To: solr-user@lucene.apache.org
: Sent: Friday, September 8, 2006 1:46:19 AM
: Subject: Re: SolrCore as Singleton?
:
:
: : I am currently in the startup phase of my thesis regarding open source
: : and enterprise search. After having worked at perhaps the leading major
: : enterprise search company, I have the impression that multiple
: : collections is a very common feature (and very sought-after). It is a
: : trend I see not just directly from my work, but most certainly also as a
: : result of enterprise search solutions becoming more common in general.
:
: SolrCore being a singleton doesn't prevent you from having multiple
: collections per JVM -- you just need to run multiple instances of the
: webapp within a single servlet container using JNDI to specify the
: separate solr.home directories; specifics for doing this in Tomcat are on
: the wiki...
: http://wiki.apache.org/solr/SolrTomcat
:
: : Until this framework is available with its appropriate configuration
: : files, administrator interface and so on in place, it seems a bit
: : unnatural to support multiple collections from the same application
: : instance.
: :
: : Bottom line (for now): I think that users looking for enterprise search
: : solutions must have a simple way of creating multiple collections from
: : within the same application.
:
: Well it's pretty easy right now to make a new collection -- it's
: just two new files (solrconfig.xml and schema.xml)
:
:
:
:
: -Hoss
:
:
:
:
-Hoss
Re: SolrCore as Singleton?
Posted by Otis Gospodnetic <ot...@yahoo.com>.
Nice. Is the same doable under Jetty? (never had to deal with JNDI under Jetty)
Otis
----- Original Message ----
From: Chris Hostetter <ho...@fucit.org>
To: solr-user@lucene.apache.org
Sent: Friday, September 8, 2006 1:46:19 AM
Subject: Re: SolrCore as Singleton?
: I am currently in the startup phase of my thesis regarding open source
: and enterprise search. After having worked at perhaps the leading major
: enterprise search company, I have the impression that multiple
: collections is a very common feature (and very sought-after). It is a
: trend I see not just directly from my work, but most certainly also as a
: result of enterprise search solutions becoming more common in general.
SolrCore being a singleton doesn't prevent you from having multiple
collections per JVM -- you just need to run multiple instances of the
webapp within a single servlet container using JNDI to specify the
separate solr.home directories; specifics for doing this in Tomcat are on
the wiki...
http://wiki.apache.org/solr/SolrTomcat
: Until this framework is available with its appropriate configuration
: files, administrator interface and so on in place, it seems a bit
: unnatural to support multiple collections from the same application
: instance.
:
: Bottom line (for now): I think that users looking for enterprise search
: solutions must have a simple way of creating multiple collections from
: within the same application.
Well it's pretty easy right now to make a new collection -- it's
just two new files (solrconfig.xml and schema.xml)
-Hoss
Re: SolrCore as Singleton?
Posted by Chris Hostetter <ho...@fucit.org>.
: I am currently in the startup phase of my thesis regarding open source
: and enterprise search. After having worked at perhaps the leading major
: enterprise search company, I have the impression that multiple
: collections is a very common feature (and very sought-after). It is a
: trend I see not just directly from my work, but most certainly also as a
: result of enterprise search solutions becoming more common in general.
SolrCore being a singleton doesn't prevent you from having multiple
collections per JVM -- you just need to run multiple instances of the
webapp within a single servlet container using JNDI to specify the
separate solr.home directories; specifics for doing this in Tomcat are on
the wiki...
http://wiki.apache.org/solr/SolrTomcat
: Until this framework is available with its appropriate configuration
: files, administrator interface and so on in place, it seems a bit
: unnatural to support multiple collections from the same application
: instance.
:
: Bottom line (for now): I think that users looking for enterprise search
: solutions must have a simple way of creating multiple collections from
: within the same application.
Well it's pretty easy right now to make a new collection -- it's
just two new files (solrconfig.xml and schema.xml)
-Hoss
Re: SolrCore as Singleton?
Posted by Eivind Hasle Amundsen <ei...@ifi.uio.no>.
Chris Hostetter wrote:
> I'm going to sidestep the issue of whether there *was* a good reason for
> it, as well as the "does the singleton pattern make sense for the current
> usage" question and answer what i think is an equally significant
> question: "what are the implications of trying to change it now?" ... the
> biggest i can think of being that SolrConfig is also a static singleton,
> and a *lot* of code in the Solr code base would need to be changed to
> support multiple SolrConfigs ... and without multiple SolrConfigs, there
> really isn't any reason to have multiple SolrCores.
This actually underlines that my guess was right to a certain extent.
Changing from singleton is not straightforward.
I am currently in the startup phase of my thesis regarding open source
and enterprise search. After having worked at perhaps the leading major
enterprise search company, I have the impression that multiple
collections is a very common feature (and very sought-after). It is a
trend I see not just directly from my work, but most certainly also as a
result of enterprise search solutions becoming more common in general.
However I must say that Solr seems to be approaching the problem from a
very logical angle. What really is missing is a more abstract layer,
call it application framework, that probably will come afterwards
anyway. This will perhaps evolve naturally as part of the Solr project
at a later stage, or perhaps even as a separate open source project
building on Solr.
Until this framework is available with its appropriate configuration
files, administrator interface and so on in place, it seems a bit
unnatural to support multiple collections from the same application
instance.
Bottom line (for now): I think that users looking for enterprise search
solutions must have a simple way of creating multiple collections from
within the same application.
I apologize for my very philosophical e-mail, but I tend to become
somewhat visionary and conceptual after a few beers, and this might not
be the perfect forum for these discussions(?) :)
Eivind
Re: SolrCore as Singleton?
Posted by Chris Hostetter <ho...@fucit.org>.
: Is there a good reason for implementing SolrCore as a Singleton?
I'm going to sidestep the issue of whether there *was* a good reason for
it, as well as the "does the singleton pattern make sense for the current
usage" question and answer what i think is an equally significant
question: "what are the implications of trying to change it now?" ... the
biggest i can think of being that SolrConfig is also a static singleton,
and a *lot* of code in the Solr code base would need to be changed to
support multiple SolrConfigs ... and without multiple SolrConfigs, there
really isn't any reason to have multiple SolrCores.
-Hoss
Re: SolrCore as Singleton?
Posted by Eivind Hasle Amundsen <ei...@ifi.uio.no>.
> If there is no specific reason for making it a Singleton, I'd vote for
> removing this so that the
> SolrCore(dataDir, schema) constructor could be used to instantiate
> multiple cores.
I agree with your arguments. However (although being new to Solr) there
is more than one way to do it, I think.
To be more specific it seems that using several different indexes with
individual datadirs and schemas is very useful, based on my impression
that many enterprise users seem to want this functionality. It is not
difficult to imagine such a usage pattern or implementation of Solr, in
the abstract, for almost all uses.
However (and this is where most you guys should fill me in), it could be
wasteful to run multiple complete instances. Could information be shared
in some way between the instances to save on resources? Perhaps what I
am really trying to say here, is that we have to look at the whole model
when considering how to implement better support for the desired usage
pattern outlined above.
Eivind
SolrCore as Singleton?
Posted by Joachim Martin <jm...@path-works.com>.
Is there a good reason for implementing SolrCore as a Singleton?
We are experimenting with running Solr as a Spring service embedded in
our app. Since it is a Singleton
we cannot have more than one index (not currently a problem, but could be).
I note the comment:
// Singleton for now...
If there is no specific reason for making it a Singleton, I'd vote for
removing this so that the
SolrCore(dataDir, schema) constructor could be used to instantiate
multiple cores.
Seems to me that since the primary usage scenario of solr is access via
REST (i.e. no Solr jar/API),
the Singleton pattern is not necessary here.
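The two shapes under discussion look roughly like this (an illustrative Java sketch with made-up names, not Solr's actual source):

```java
// Hypothetical sketch of the singleton shape being debated -- not Solr's code.
final class Core {
    // The singleton instance: only one index per JVM is possible this way.
    private static Core instance;

    private final String dataDir;

    // "Singleton for now..." -- a private constructor blocks outside instantiation.
    private Core(String dataDir) {
        this.dataDir = dataDir;
    }

    static synchronized Core getInstance() {
        if (instance == null) {
            instance = new Core("/index/default");
        }
        return instance;
    }

    String getDataDir() { return dataDir; }
}

// The alternative being proposed: a public constructor, so callers can
// instantiate as many cores as they need, each with its own data directory.
final class MultiCore {
    private final String dataDir;

    MultiCore(String dataDir) { this.dataDir = dataDir; }

    String getDataDir() { return dataDir; }
}
```

With the first shape, every caller shares one index; with the second, two instances with different data directories can coexist in one JVM.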
--Joachim
Re: Double Solr Installation on Single Tomcat (or Double Index)
Posted by Przemysław Brzozowski <br...@interia.pl>.
Tom Weber wrote:
> Hello,
>
> I need to have a second separate index (separate data) on the same
> server.
>
> Is there a possibility to do this in a single solr install on a
> tomcat server or do I need to have a second instance in the same
> tomcat install ?
>
You will need separate instances within the same Tomcat.
> If either one is possible, does somebody has some advice how to set
> this up, and how to be sure that both indexes do not interact ?
>
Create a context XML file for each Solr application in the folder
CATALINA_HOME\conf\Catalina\localhost\ (e.g. context_name.xml):
<Context docBase="${catalina.home}/..../solr.war" debug="0"
crossContext="true">
<Environment name="solr/home" type="java.lang.String"
value="${catalina.home}\solr_data_files\" override="true" />
</Context>
Adjust docBase to point at solr.war.
Adjust solr/home to point at your solr_data_files directory - a different
folder for each Solr instance.
If the context xml file is called solr1.xml, then you can access that
solr instance using the following URL: http://host:port/solr1/admin.
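On the application side, a webapp reads such an <Environment> entry with a JNDI lookup. A rough sketch of the resolution order (illustrative code with a made-up class name, not Solr's exact implementation; the solr.solr.home system-property fallback is an assumption here):

```java
import javax.naming.InitialContext;
import javax.naming.NamingException;

// Sketch of how a webapp can resolve its Solr home: JNDI first (the
// per-context <Environment name="solr/home"> entry), then a system
// property, then a default.
class SolrHomeResolver {
    static String resolveSolrHome() {
        // 1. JNDI: bound per context, so each webapp instance in the same
        //    Tomcat gets its own home directory.
        try {
            InitialContext ctx = new InitialContext();
            String home = (String) ctx.lookup("java:comp/env/solr/home");
            if (home != null) return home;
        } catch (NamingException ignored) {
            // No JNDI entry bound for this context -- fall through.
        }
        // 2. System property: shared by the whole JVM, so it cannot
        //    distinguish two Solr webapps running in one Tomcat.
        String home = System.getProperty("solr.solr.home");
        if (home != null) return home;
        // 3. Default relative directory.
        return "solr/";
    }
}
```

This is why the JNDI route keeps the two indexes from interacting: each context resolves a different home before either index is opened.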
> Many thanks for any help,
>
> Best Greetings,
>
> Tom
>
>
Re: Double Solr Installation on Single Tomcat (or Double Index)
Posted by Yonik Seeley <yo...@apache.org>.
Another way to run multiple solr webapps with Tomcat involves context
fragments. It allows you to use a single copy of the solr.war but
specify different configs (via different solrhomes).
http://wiki.apache.org/solr/SolrTomcat
-Yonik
On 9/6/06, sangraal aiken <sa...@gmail.com> wrote:
> I've set up 2 separate Solr indexes on one Tomcat instance. I basically
> created two separate Solr webapps. I have one webapp that is the client to
> both Solr instances as well. So the whole setup is 3 webapps.
>
> I have one set of Solr source classes and an ant task to build a jar file
> and copy it into the lib directory of both Solr webapps. This way if you
> customize your Solr installs you only have to do it once. Each Solr webapp
> obviously needs it's own solr config and data directories which is
> configurable through solrConfig. Both indexes are completely separate and
> configurable independently through these config files.
>
> If you need more detail let me know, I'll try to help you out.
>
> -S
>
> On 9/6/06, Tom Weber <to...@rtl.lu> wrote:
> >
> > Hello,
> >
> > I need to have a second separate index (separate data) on the same
> > server.
> >
> > Is there a possibility to do this in a single solr install on a
> > tomcat server or do I need to have a second instance in the same
> > tomcat install ?
> >
> > If either one is possible, does somebody has some advice how to
> > set this up, and how to be sure that both indexes do not interact ?
> >
> > Many thanks for any help,
> >
> > Best Greetings,
> >
> > Tom
> >
Re: Double Solr Installation on Single Tomcat (or Double Index)
Posted by sangraal aiken <sa...@gmail.com>.
I've set up 2 separate Solr indexes on one Tomcat instance. I basically
created two separate Solr webapps. I have one webapp that is the client to
both Solr instances as well. So the whole setup is 3 webapps.
I have one set of Solr source classes and an ant task to build a jar file
and copy it into the lib directory of both Solr webapps. This way if you
customize your Solr installs you only have to do it once. Each Solr webapp
obviously needs its own solr config and data directories, which are
configurable through solrConfig. Both indexes are completely separate and
configurable independently through these config files.
If you need more detail let me know, I'll try to help you out.
-S
On 9/6/06, Tom Weber <to...@rtl.lu> wrote:
>
> Hello,
>
> I need to have a second separate index (separate data) on the same
> server.
>
> Is there a possibility to do this in a single solr install on a
> tomcat server or do I need to have a second instance in the same
> tomcat install ?
>
> If either one is possible, does somebody has some advice how to
> set this up, and how to be sure that both indexes do not interact ?
>
> Many thanks for any help,
>
> Best Greetings,
>
> Tom
>
Re: Got it working! And some questions
Posted by James liu <li...@gmail.com>.
- Is the solr php in the wiki working out of the box for anyone?
Show your php.ini. Did you performance-tune your PHP?
2006/9/10, Brian Lucas <bl...@gmail.com>:
>
> Hi Michael,
>
> I apologize for the lack of testing on the SolPHP. I had to "strip" it
> down
> significantly to turn it into a general class that would be usable and the
> version up there has not been extensively tested yet (I'm almost ready to
> get back to that and "revise" it), plus much of my coding is done in Rails
> at the moment. However...
>
> If you have a new version, could you send it over my way or just upload it
> to the wiki? I'd like to take a look at the changes and throw your
> revised
> version up there or integrate both versions into a cleaner revision of the
> version already there.
>
> With respect to batch queries, it's already designed to do that (that's
> why
> you see "array($array)" in the example, because it accepts an array of
> updates) but I'd definitely like to see how you revised it.
>
> Thanks,
> Brian
>
>
> -----Original Message-----
> From: Michael Imbeault [mailto:michael.imbeault@sympatico.ca]
> Sent: Saturday, September 09, 2006 12:30 PM
> To: solr-user@lucene.apache.org
> Subject: Got it working! And some questions
>
> First of all, in reference to
> http://www.mail-archive.com/solr-user@lucene.apache.org/msg00808.html ,
> I got it working! The problem(s) was coming from solPHP; the
> implementation in the wiki isn't really working, to be honest, at least
> for me. I had to modify it significantly at multiple places to get it
> working. Tomcat 5.5, WAMP and Windows XP.
>
> The main problem was that addIndex was sending 1 doc at a time to solr;
> it would cause a problem after a few thousand docs because i was running
> out of resources. I modified solr_update.php to handle batch queries,
> and i'm now sending batches of 1000 docs at a time. Great indexing speed.
>
> Had a slight problem with the curl function of solr_update.php; the
> custom HTTP header wasn't recognized; I now use curl_setopt($ch,
> CURLOPT_POST, 1); curl_setopt($ch, CURLOPT_POSTFIELDS, $post_string); -
> much simpler, and now everything works!
>
> Up so far I indexed 15.000.000 documents (my whole collection,
> basically) and the performance i'm getting is INCREDIBLE (sub 100ms
> query time without warmup and no optimization at all on a 7 gigs index -
> and with the cache, it gets stupid fast)! Seriously, Solr amaze me every
> time I use it. I increased HashDocSet Maxsize to 75000, will continue to
> optimize this value - it helped a great deal. I will try disMaxHandler
> soon too; right now the standard one is great. And I will index with a
> better stopword file; the default one could really use improvements.
>
> Some questions (couldn't find the answer in the docs):
>
> - Is the solr php in the wiki working out of the box for anyone? Else we
> could modify the wiki...
>
> - What is the loadFactor variable of HashDocSet? Should I optimize it too?
>
> - What are the units on the size value of the caches? Megs, number of
> queries, kilobytes? Not described anywhere.
>
> - Any way to programmatically change the OR/AND preference of the query
> parser? I set it to AND by default for user queries, but i'd like to set
> it to OR for some server-side queries I must do (find related articles,
> order by score).
>
> - What's the difference between the 2 commit types? Blocking and
> non-blocking. Didn't see any differences at all, tried both.
>
> - Every time I do an <optimize> command, I get the following in my
> catalina logs - should I do anything about it?
>
> 9-Sep-2006 2:24:40 PM org.apache.solr.core.SolrException log
> SEVERE: Exception during commit/optimize:java.io.EOFException: no more
> data available - expected end tag </optimize> to close start tag
> <optimize> from line 1, parser stopped on START_TAG seen <optimize>...
> @1:10
>
> - Any benefits of setting the allowed memory for Tomcat higher? Right
> now im allocating 384 megs.
>
> Can't wait to try the new Faceted Queries... seriously, solr is really,
> really awesome up so far. Thanks for all your work, and sorry for all
> the questions!
>
> --
> Michael Imbeault
> CHUL Research Center (CHUQ)
> 2705 boul. Laurier
> Ste-Foy, QC, Canada, G1V 4G2
> Tel: (418) 654-2705, Fax: (418) 654-2212
>
>
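The batching change Michael describes above amounts to wrapping many <doc> elements in a single <add> message instead of posting one document per request. A minimal sketch of building such a message (a hypothetical helper, not the solPHP code; field names in the test are illustrative):

```java
import java.util.List;
import java.util.Map;

// Builds one Solr <add> message containing many documents, so a batch of
// e.g. 1000 docs goes over the wire in a single POST to /update.
class BatchAddBuilder {
    // Minimal XML escaping for text content and attribute values.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    static String buildAddXml(List<Map<String, String>> docs) {
        StringBuilder xml = new StringBuilder("<add>");
        for (Map<String, String> doc : docs) {
            xml.append("<doc>");
            for (Map.Entry<String, String> field : doc.entrySet()) {
                xml.append("<field name=\"").append(escape(field.getKey())).append("\">")
                   .append(escape(field.getValue()))
                   .append("</field>");
            }
            xml.append("</doc>");
        }
        return xml.append("</add>").toString();
    }
}
```

The resulting string is then POSTed to the Solr update URL with a Content-type of text/xml, which is what the CURLOPT_POSTFIELDS change in the quoted message accomplishes on the PHP side.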
Re: Got it working! And some questions
Posted by Yonik Seeley <yo...@apache.org>.
On 9/9/06, Michael Imbeault <mi...@sympatico.ca> wrote:
> The main problem was that addIndex was sending 1 doc at a time to solr;
> it would cause a problem after a few thousand docs because i was running
> out of resources.
Sending one doc at a time should be fine... you shouldn't run out of
resources.
There must be a bug somewhere...
-Yonik
Re: Got it working! And some questions
Posted by Chris Hostetter <ho...@fucit.org>.
: First of all, it seems the mailing list is having some troubles? Some of
: my posts end up in the wrong thread (even new threads I post), I don't
: receive them in my mail, and they're present only in the 'date archive'
: of http://www.mail-archive.com, and not in the 'thread' one? I don't
: receive some of the other peoples post in my mail too, problems started
: last week I think.
i haven't noticed any problems with mail not making it through - some mail
clients (gmail for example) seem to suppress messages they can tell you
sent, maybe that's what's happening on your end? As for
threads you start not showing up on the "thread" list ... according to
my mailbox, all but one message i've received from you included a
"References:" header (if not an In-Reply-To header) which causes some mail
archivers to assume it's part of an existing thread (this thread for
instance is considered part of the "Double Solr Installation on Single
Tomcat (or Double Index)" thread) ... you may want to experiment with
your mail client (off list) to see if you can figure out when/why this is
happening.
: Secondly, Chris, thanks for all the useful answers, everything is much
: clearer now. This info should be added to the wiki I think; should I do
feel free ... that's why it's a wiki.
: it? I'm still a little disappointed that I can't change the OR/AND
: parsing by just changing some parameter (like I can do for the number of
: results returned, for example); adding a OR between each word in the
: text i want to compare sounds suboptimal, but i'll probably do it that
: way; its a very minor nitpick, solr is awesome, as I said before.
it would be a fairly simple option to add just like changing the
default field (patches welcome!) but as i said -- typically if you don't
want the default behavior you are programmatically generating the query
anyway, and already adding some markup, a little more doesn't make it less
optimal.
-Hoss
Re: Got it working! And some questions
Posted by Chris Hostetter <ho...@fucit.org>.
: Maybe something like q.op or q.oper if it *only* applies to q. Which
: begs the question... what *does* it apply to? At first blush, it
: doesn't seem like it should apply to other queries like fq, facet
: queries, and esp queries defined in solrconfig.xml. I think that
: would be very surprising.
agreed, note the comment i put into SolrPluginUtils.parseFilterQueries when
i added fq support to StandardRequestHandler...
/* Ignore SolrParams.DF - could have init param FQs assuming the
* schema default with query param DF intended to only affect Q.
* If user doesn't want schema default, they should be explicit in the FQ.
*/
... i would think a "do" or "op" or "q.op" param should *definitely* only
influence the "q" param.
-Hoss
Re: Got it working! And some questions
Posted by Chris Hostetter <ho...@fucit.org>.
: SolrQueryParser now knows nothing about the default operator, it is
: set from QueryParsing.parseQuery() when passed a SolrParams.
i didn't test it, but it looks clean to me.
the only other thing i would do is beef up the javadocs for
SolrQueryParser (to clarify that IndexSchema is only used for determining
field format) and QueryParsing.parseQuery (to clarify that it *does* use
IndexSearcher to get extra parsing options).
: QueryParsing.parseQuery() methods could be simplified, perhaps even
...
: It could even get the "q" parameter from there, but there is code
: that passes expressions that don't come from "q". Maybe we could
...yeah, its utility for simple queries regardless of the "primary"
language of a request handler is key.
: have two parseQuery() methods: parseQuery(String expression,
: SolrQueryRequest req) and parseQuery(SolrQueryRequest req), and for
: the latter the "q" parameter is pulled from the request and used as
: the expression.
That sounds good to me ... but it doesn't seem critical ... clean house as
much as you want, but i don't think anybody else will mind a bit of dust
on the window sills.
-Hoss
Re: Got it working! And some questions
Posted by Erik Hatcher <er...@ehatchersolutions.com>.
On Sep 12, 2006, at 4:47 PM, Chris Hostetter wrote:
> : I've implemented the ability to override the default operator with
> : q.op=AND|OR. The patch is pasted below for your review.
>
> if i'm reading that right, one subtlety is that "new
> SolrQueryParser(schema,field)" no longer pays attention to
> schema.getQueryParserDefaultOperator() -- that only becomes
> applicable when using QueryParsing.parseQuery
>
> ...i am very okay with this change, i wasn't really a fan of the
> fact that
> the SolrQueryParser pulled that info out of the IndexSchema in its
> constructor previously, i just wanted to point out that this patch
> would
> change that.
>
> Perhaps the constructor for SolrQueryParser shouldn't be aware of
> the op
> at all (either from the schema or from the SolrParams) -- and
> setting it
> should be left to QueryParsing.parseQuery (or some other utility in
> the
> QueryParsing class) ... personally i'm a fan of leaving
> SolrQueryParser as
> much like QueryParser as possible -- with the only real change
> being the
> knowledge of the individual field formats.
I've reworked it based on your feedback. The patch is pasted below.
SolrQueryParser now knows nothing about the default operator, it is
set from QueryParsing.parseQuery() when passed a SolrParams.
QueryParsing.parseQuery() methods could be simplified, perhaps even
into a single method, that took a query expression and a
SolrQueryRequest, where it can get the SolrParams and IndexSchema.
It could even get the "q" parameter from there, but there is code
that passes expressions that don't come from "q". Maybe we could
have two parseQuery() methods: parseQuery(String expression,
SolrQueryRequest req) and parseQuery(SolrQueryRequest req), and for
the latter the "q" parameter is pulled from the request and used as
the expression.
As it is, the patch below works fine and I'm happy to commit it, but
am happy to rework this sort of thing to get it as clean as others like.
Erik
Index: src/java/org/apache/solr/search/SolrQueryParser.java
===================================================================
--- src/java/org/apache/solr/search/SolrQueryParser.java (revision
442772)
+++ src/java/org/apache/solr/search/SolrQueryParser.java (working copy)
@@ -37,7 +37,6 @@
super(defaultField == null ? schema.getDefaultSearchFieldName
() : defaultField, schema.getQueryAnalyzer());
this.schema = schema;
setLowercaseExpandedTerms(false);
- setDefaultOperator("AND".equals
(schema.getQueryParserDefaultOperator()) ? QueryParser.Operator.AND :
QueryParser.Operator.OR);
}
protected Query getFieldQuery(String field, String queryText)
throws ParseException {
Index: src/java/org/apache/solr/search/QueryParsing.java
===================================================================
--- src/java/org/apache/solr/search/QueryParsing.java (revision 442772)
+++ src/java/org/apache/solr/search/QueryParsing.java (working copy)
@@ -19,6 +19,7 @@
import org.apache.lucene.search.*;
import org.apache.solr.search.function.*;
import org.apache.lucene.queryParser.ParseException;
+import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.Term;
import org.apache.solr.core.SolrCore;
@@ -26,6 +27,7 @@
import org.apache.solr.schema.IndexSchema;
import org.apache.solr.schema.SchemaField;
import org.apache.solr.schema.FieldType;
+import org.apache.solr.request.SolrParams;
import java.util.ArrayList;
import java.util.regex.Pattern;
@@ -37,6 +39,7 @@
* @version $Id$
*/
public class QueryParsing {
+ public static final String OP = "q.op";
public static Query parseQuery(String qs, IndexSchema schema) {
return parseQuery(qs, null, schema);
@@ -58,8 +61,26 @@
}
}
+ public static Query parseQuery(String qs, String defaultField,
SolrParams params, IndexSchema schema) {
+ try {
+ String opParam = params.get(OP,
schema.getQueryParserDefaultOperator());
+ QueryParser.Operator defaultOperator = "AND".equals(opParam) ?
QueryParser.Operator.AND : QueryParser.Operator.OR;
+ SolrQueryParser parser = new SolrQueryParser(schema,
defaultField);
+ parser.setDefaultOperator(defaultOperator);
+ Query query = parser.parse(qs);
+ if (SolrCore.log.isLoggable(Level.FINEST)) {
+ SolrCore.log.finest("After QueryParser:" + query);
+ }
+ return query;
+
+ } catch (ParseException e) {
+ SolrCore.log(e);
+ throw new SolrException(400,"Error parsing Lucene query",e);
+ }
+ }
+
/***
* SortSpec encapsulates a Lucene Sort and a count of the number
of documents
* to return.
Index: src/java/org/apache/solr/request/StandardRequestHandler.java
===================================================================
--- src/java/org/apache/solr/request/StandardRequestHandler.java
(revision 442772)
+++ src/java/org/apache/solr/request/StandardRequestHandler.java
(working copy)
@@ -105,7 +105,7 @@
List<String> commands = StrUtils.splitSmart(sreq,';');
String qs = commands.size() >= 1 ? commands.get(0) : "";
- Query query = QueryParsing.parseQuery(qs, defaultField,
req.getSchema());
+ Query query = QueryParsing.parseQuery(qs, defaultField, p,
req.getSchema());
// If the first non-query, non-filter command is a simple
sort on an indexed field, then
// we can use the Lucene sort ability.
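For reference, the operator selection in the patch above boils down to a two-step fallback: take q.op from the request if present, else the schema default, and anything other than the literal string "AND" means OR. A minimal standalone sketch (using a stand-in Operator enum, since Lucene's QueryParser.Operator is not reproduced here; the null handling mirrors SolrParams.get(key, default)):

```java
// Stand-in for QueryParser.Operator; the real patch uses Lucene's type.
enum Operator { AND, OR }

class DefaultOperatorSketch {
    // Mirrors: params.get(OP, schema.getQueryParserDefaultOperator())
    // followed by the "AND".equals(opParam) ? AND : OR test in the patch.
    static Operator resolve(String qOpParam, String schemaDefault) {
        String opParam = (qOpParam != null) ? qOpParam : schemaDefault;
        return "AND".equals(opParam) ? Operator.AND : Operator.OR;
    }
}
```

Note that the comparison is case-sensitive: q.op=and falls through to OR, exactly as written in the patch.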
Re: Got it working! And some questions
Posted by Chris Hostetter <ho...@fucit.org>.
: I've implemented the ability to override the default operator with
: q.op=AND|OR. The patch is pasted below for your review.
if i'm reading that right, one subtlety is that "new
SolrQueryParser(schema,field)" no longer pays attention to
schema.getQueryParserDefaultOperator() -- that only becomes
applicable when using QueryParsing.parseQuery
...i am very okay with this change, i wasn't really a fan of the fact that
the SolrQueryParser pulled that info out of the IndexSchema in its
constructor previously, i just wanted to point out that this patch would
change that.
Perhaps the constructor for SolrQueryParser shouldn't be aware of the op
at all (either from the schema or from the SolrParams) -- and setting it
should be left to QueryParsing.parseQuery (or some other utility in the
QueryParsing class) ... personally i'm a fan of leaving SolrQueryParser as
much like QueryParser as possible -- with the only real change being the
knowledge of the individual field formats.
: Index: src/java/org/apache/solr/search/SolrQueryParser.java
: ===================================================================
: --- src/java/org/apache/solr/search/SolrQueryParser.java (revision
: 442689)
: +++ src/java/org/apache/solr/search/SolrQueryParser.java (working copy)
: @@ -34,10 +34,14 @@
: protected final IndexSchema schema;
: public SolrQueryParser(IndexSchema schema, String defaultField) {
: + this(schema, defaultField, QueryParser.Operator.OR);
: + }
: +
: + public SolrQueryParser(IndexSchema schema, String defaultField,
: QueryParser.Operator defaultOperator) {
: super(defaultField == null ? schema.getDefaultSearchFieldName
: () : defaultField, schema.getQueryAnalyzer());
: this.schema = schema;
: setLowercaseExpandedTerms(false);
: - setDefaultOperator("AND".equals
: (schema.getQueryParserDefaultOperator()) ? QueryParser.Operator.AND :
: QueryParser.Operator.OR);
: + setDefaultOperator(defaultOperator);
: }
: protected Query getFieldQuery(String field, String queryText)
: throws ParseException {
: Index: src/java/org/apache/solr/search/QueryParsing.java
: ===================================================================
: --- src/java/org/apache/solr/search/QueryParsing.java (revision 442689)
: +++ src/java/org/apache/solr/search/QueryParsing.java (working copy)
: @@ -19,6 +19,7 @@
: import org.apache.lucene.search.*;
: import org.apache.solr.search.function.*;
: import org.apache.lucene.queryParser.ParseException;
: +import org.apache.lucene.queryParser.QueryParser;
: import org.apache.lucene.document.Field;
: import org.apache.lucene.index.Term;
: import org.apache.solr.core.SolrCore;
: @@ -26,6 +27,7 @@
: import org.apache.solr.schema.IndexSchema;
: import org.apache.solr.schema.SchemaField;
: import org.apache.solr.schema.FieldType;
: +import org.apache.solr.request.SolrParams;
: import java.util.ArrayList;
: import java.util.regex.Pattern;
: @@ -37,6 +39,7 @@
: * @version $Id$
: */
: public class QueryParsing {
: + public static final String OP = "q.op";
: public static Query parseQuery(String qs, IndexSchema schema) {
: return parseQuery(qs, null, schema);
: @@ -58,8 +61,24 @@
: }
: }
: + public static Query parseQuery(String qs, String defaultField,
: SolrParams params, IndexSchema schema) {
: + try {
: + String opParam = params.get(OP,
: schema.getQueryParserDefaultOperator());
: + QueryParser.Operator defaultOperator = "AND".equals(opParam) ?
: QueryParser.Operator.AND : QueryParser.Operator.OR;
: + Query query = new SolrQueryParser(schema, defaultField,
: defaultOperator).parse(qs);
: + if (SolrCore.log.isLoggable(Level.FINEST)) {
: + SolrCore.log.finest("After QueryParser:" + query);
: + }
: + return query;
: +
: + } catch (ParseException e) {
: + SolrCore.log(e);
: + throw new SolrException(400,"Error parsing Lucene query",e);
: + }
: + }
: +
: /***
: * SortSpec encapsulates a Lucene Sort and a count of the number
: of documents
: * to return.
: Index: src/java/org/apache/solr/request/StandardRequestHandler.java
: ===================================================================
: --- src/java/org/apache/solr/request/StandardRequestHandler.java
: (revision 442689)
: +++ src/java/org/apache/solr/request/StandardRequestHandler.java
: (working copy)
: @@ -94,7 +94,7 @@
: List<String> commands = StrUtils.splitSmart(sreq,';');
: String qs = commands.size() >= 1 ? commands.get(0) : "";
: - Query query = QueryParsing.parseQuery(qs, defaultField,
: req.getSchema());
: + Query query = QueryParsing.parseQuery(qs, defaultField, p,
: req.getSchema());
: // If the first non-query, non-filter command is a simple
: sort on an indexed field, then
: // we can use the Lucene sort ability.
:
-Hoss
Re: Got it working! And some questions
Posted by Erik Hatcher <er...@ehatchersolutions.com>.
On Sep 11, 2006, at 2:52 PM, Yonik Seeley wrote:
> On 9/11/06, Erik Hatcher <er...@ehatchersolutions.com> wrote:
>>
>> On Sep 10, 2006, at 10:47 PM, Michael Imbeault wrote:
>> > I'm still a little disappointed that I can't change the OR/AND
>> > parsing by just changing some parameter (like I can do for the
>> > number of results returned, for example); adding a OR between each
>> > word in the text i want to compare sounds suboptimal, but i'll
>> > probably do it that way; its a very minor nitpick, solr is awesome,
>> > as I said before.
>>
>> I'm the one that added support for controlling the default operator
>> of Solr's query parser, and I hadn't considered the use case of
>> controlling that setting from a request parameter. It should be easy
>> enough to add. I'll take a look at adding that support and commit it
>> once I have it working.
>>
>> What parameter name should be used for this? do=[AND|OR] (for
>> default operator)? We have df for default field.
>
> Maybe something like q.op or q.oper if it *only* applies to q. Which
> begs the question... what *does* it apply to? At first blush, it
> doesn't seem like it should apply to other queries like fq, facet
> queries, and esp queries defined in solrconfig.xml. I think that
> would be very surprising.
I've implemented the ability to override the default operator with
q.op=AND|OR. The patch is pasted below for your review.
The one thing I don't like is that QueryParsing.parseQuery(String qs,
String defaultField, SolrParams params, IndexSchema schema) is a bit
redundant in that it takes defaultField which can also be gleaned
from params, but StandardRequestHandler uses "df" for highlighting also.
I'm happy to commit this if there are no objections or suggestions
for improvement (and of course update the wiki documentation for the
parameters).
Erik
Index: src/java/org/apache/solr/search/SolrQueryParser.java
===================================================================
--- src/java/org/apache/solr/search/SolrQueryParser.java (revision
442689)
+++ src/java/org/apache/solr/search/SolrQueryParser.java (working copy)
@@ -34,10 +34,14 @@
protected final IndexSchema schema;
public SolrQueryParser(IndexSchema schema, String defaultField) {
+ this(schema, defaultField, QueryParser.Operator.OR);
+ }
+
+ public SolrQueryParser(IndexSchema schema, String defaultField,
QueryParser.Operator defaultOperator) {
super(defaultField == null ? schema.getDefaultSearchFieldName
() : defaultField, schema.getQueryAnalyzer());
this.schema = schema;
setLowercaseExpandedTerms(false);
- setDefaultOperator("AND".equals
(schema.getQueryParserDefaultOperator()) ? QueryParser.Operator.AND :
QueryParser.Operator.OR);
+ setDefaultOperator(defaultOperator);
}
protected Query getFieldQuery(String field, String queryText)
throws ParseException {
Index: src/java/org/apache/solr/search/QueryParsing.java
===================================================================
--- src/java/org/apache/solr/search/QueryParsing.java (revision 442689)
+++ src/java/org/apache/solr/search/QueryParsing.java (working copy)
@@ -19,6 +19,7 @@
import org.apache.lucene.search.*;
import org.apache.solr.search.function.*;
import org.apache.lucene.queryParser.ParseException;
+import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.Term;
import org.apache.solr.core.SolrCore;
@@ -26,6 +27,7 @@
import org.apache.solr.schema.IndexSchema;
import org.apache.solr.schema.SchemaField;
import org.apache.solr.schema.FieldType;
+import org.apache.solr.request.SolrParams;
import java.util.ArrayList;
import java.util.regex.Pattern;
@@ -37,6 +39,7 @@
* @version $Id$
*/
public class QueryParsing {
+ public static final String OP = "q.op";
public static Query parseQuery(String qs, IndexSchema schema) {
return parseQuery(qs, null, schema);
@@ -58,8 +61,24 @@
}
}
+ public static Query parseQuery(String qs, String defaultField,
SolrParams params, IndexSchema schema) {
+ try {
+ String opParam = params.get(OP,
schema.getQueryParserDefaultOperator());
+ QueryParser.Operator defaultOperator = "AND".equals(opParam) ?
QueryParser.Operator.AND : QueryParser.Operator.OR;
+ Query query = new SolrQueryParser(schema, defaultField,
defaultOperator).parse(qs);
+ if (SolrCore.log.isLoggable(Level.FINEST)) {
+ SolrCore.log.finest("After QueryParser:" + query);
+ }
+ return query;
+
+ } catch (ParseException e) {
+ SolrCore.log(e);
+ throw new SolrException(400,"Error parsing Lucene query",e);
+ }
+ }
+
/***
* SortSpec encapsulates a Lucene Sort and a count of the number
of documents
* to return.
Index: src/java/org/apache/solr/request/StandardRequestHandler.java
===================================================================
--- src/java/org/apache/solr/request/StandardRequestHandler.java
(revision 442689)
+++ src/java/org/apache/solr/request/StandardRequestHandler.java
(working copy)
@@ -94,7 +94,7 @@
List<String> commands = StrUtils.splitSmart(sreq,';');
String qs = commands.size() >= 1 ? commands.get(0) : "";
- Query query = QueryParsing.parseQuery(qs, defaultField,
req.getSchema());
+ Query query = QueryParsing.parseQuery(qs, defaultField, p,
req.getSchema());
// If the first non-query, non-filter command is a simple
sort on an indexed field, then
// we can use the Lucene sort ability.
Re: Got it working! And some questions
Posted by Yonik Seeley <yo...@apache.org>.
On 9/11/06, Erik Hatcher <er...@ehatchersolutions.com> wrote:
>
> On Sep 10, 2006, at 10:47 PM, Michael Imbeault wrote:
> > I'm still a little disappointed that I can't change the OR/AND
> > parsing by just changing some parameter (like I can do for the
> > number of results returned, for example); adding a OR between each
> > word in the text i want to compare sounds suboptimal, but i'll
> > probably do it that way; its a very minor nitpick, solr is awesome,
> > as I said before.
>
> I'm the one that added support for controlling the default operator
> of Solr's query parser, and I hadn't considered the use case of
> controlling that setting from a request parameter. It should be easy
> enough to add. I'll take a look at adding that support and commit it
> once I have it working.
>
> What parameter name should be used for this? do=[AND|OR] (for
> default operator)? We have df for default field.
Maybe something like q.op or q.oper if it *only* applies to q. Which
begs the question... what *does* it apply to? At first blush, it
doesn't seem like it should apply to other queries like fq, facet
queries, and esp queries defined in solrconfig.xml. I think that
would be very surprising.
-Yonik
Re: Got it working! And some questions
Posted by Michael Imbeault <mi...@sympatico.ca>.
Hello Erik,
Thanks for adding that feature! "do" is fine with me, if "op" is already
used (not sure about this one).
Erik Hatcher wrote:
>
> On Sep 10, 2006, at 10:47 PM, Michael Imbeault wrote:
>> I'm still a little disappointed that I can't change the OR/AND
>> parsing by just changing some parameter (like I can do for the number
>> of results returned, for example); adding a OR between each word in
>> the text i want to compare sounds suboptimal, but i'll probably do it
>> that way; its a very minor nitpick, solr is awesome, as I said before.
>
> I'm the one that added support for controlling the default operator of
> Solr's query parser, and I hadn't considered the use case of
> controlling that setting from a request parameter. It should be easy
> enough to add. I'll take a look at adding that support and commit it
> once I have it working.
>
> What parameter name should be used for this? do=[AND|OR] (for
> default operator)? We have df for default field.
>
> Erik
>
--
Michael Imbeault
CHUL Research Center (CHUQ)
2705 boul. Laurier
Ste-Foy, QC, Canada, G1V 4G2
Tel: (418) 654-2705, Fax: (418) 654-2212
Re: Got it working! And some questions
Posted by Erik Hatcher <er...@ehatchersolutions.com>.
On Sep 10, 2006, at 10:47 PM, Michael Imbeault wrote:
> I'm still a little disappointed that I can't change the OR/AND
> parsing by just changing some parameter (like I can do for the
> number of results returned, for example); adding a OR between each
> word in the text i want to compare sounds suboptimal, but i'll
> probably do it that way; its a very minor nitpick, solr is awesome,
> as I said before.
I'm the one that added support for controlling the default operator
of Solr's query parser, and I hadn't considered the use case of
controlling that setting from a request parameter. It should be easy
enough to add. I'll take a look at adding that support and commit it
once I have it working.
What parameter name should be used for this? do=[AND|OR] (for
default operator)? We have df for default field.
Erik
Re: Got it working! And some questions
Posted by Michael Imbeault <mi...@sympatico.ca>.
First of all, it seems the mailing list is having some trouble? Some of
my posts end up in the wrong thread (even new threads I post), I don't
receive them in my mail, and they're present only in the 'date archive'
of http://www.mail-archive.com, and not in the 'thread' one. I don't
receive some of the other people's posts in my mail either; the problems
started last week, I think.
Secondly, Chris, thanks for all the useful answers, everything is much
clearer now. This info should be added to the wiki, I think; should I do
it? I'm still a little disappointed that I can't change the OR/AND
parsing by just changing some parameter (like I can do for the number of
results returned, for example); adding an OR between each word in the
text I want to compare sounds suboptimal, but I'll probably do it that
way. It's a very minor nitpick; Solr is awesome, as I said before.
@ Brian Lucas: Don't worry, solrPHP was still 99.9% functional, great
work; part of the reason it was sending a doc at a time was my fault, as
I was following the exact sequence (add to array, submit) displayed in
the docs. The only thing that could be added is a big "//TODO: change
this code" before sections you have to change to make it work for a
particular schema. I'm pretty sure the custom header curl submit works
for everyone other than me; I'm on a Windows test box with WAMP on it, so
it may be caused by that. I'll send you the changes I made to the code
tomorrow anyway; as I said, nothing major.
Chris Hostetter wrote:
> : - What is the loadFactor variable of HashDocSet? Should I optimize it too?
>
> this is the same as the loadFactor in a HashMap constructor -- but i don't
> think it has much affect on performance since the HashDocSets never
> "grow".
>
> I personally have never tuned the loadFactor :)
>
> : - What's the units on the size value of the caches? Megs, number of
> : queries, kilobytes? Not described anywhere.
>
> "entries" ... the number of items allowed in the cache.
>
> : - Any way to programatically change the OR/AND preference of the query
> : parser? I set it to AND by default for user queries, but i'd like to set
> : it to OR for some server-side queries I must do (find related articles,
> : order by score).
>
> you mean using StandardRequestHandler? ... not that i can think of off the
> top of my head, but typicaly it makes sense to just configure what you
> want for your "users" in the schema, and then make any machine generated
> queries be explicit.
>
> : - Whats the difference between the 2 commits type? Blocking and
> : non-blocking. Didn't see any differences at all, tried both.
>
> do you mean the waitFlush and waitSearcher options?
> if either of those is true, you shouldn't get a response back from the
> server untill they have finished. if they are false, then the server
> should respond instantly even if it takes several seconds (or maybe even
> minutes) to complete the operation (optimizes can take a while in some
> cases -- as can opening newSearchers if you have a lot of cache warming
> configured)
>
> : - Every time I do an <optimize> command, I get the following in my
> : catalina logs - should I do anything about it?
>
> the optimize command needs to be well formed XML, try "<optimize/>"
> instead of just "<optimize>"
>
> : - Any benefits of setting the allowed memory for Tomcat higher? Right
> : now im allocating 384 megs.
>
> the more memory you've got, the more cachng you can support .. but if
> your index changes so frequently compared to the rate of *unique*
> queries you get that your caches never fill up, it may not matter.
>
>
>
>
> -Hoss
>
--
Michael Imbeault
CHUL Research Center (CHUQ)
2705 boul. Laurier
Ste-Foy, QC, Canada, G1V 4G2
Tel: (418) 654-2705, Fax: (418) 654-2212
Re: Got it working! And some questions
Posted by Chris Hostetter <ho...@fucit.org>.
: - What is the loadFactor variable of HashDocSet? Should I optimize it too?
this is the same as the loadFactor in a HashMap constructor -- but i don't
think it has much effect on performance since the HashDocSets never
"grow".
I personally have never tuned the loadFactor :)
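For context, loadFactor in a HashMap constructor controls when the table resizes: a map with capacity c rehashes once it holds more than c * loadFactor entries, so a higher value trades some lookup speed for fewer resizes. A quick illustrative sketch (standard java.util.HashMap, not Solr code):

```java
import java.util.HashMap;
import java.util.Map;

class LoadFactorDemo {
    // With initialCapacity 16 and loadFactor 0.75f (the Java defaults),
    // the table rehashes once it exceeds 12 entries; the loadFactor only
    // changes when resizing happens, never the map's contents.
    static Map<String, Integer> build(int capacity, float loadFactor) {
        Map<String, Integer> m = new HashMap<>(capacity, loadFactor);
        for (int i = 0; i < 100; i++) {
            m.put("key" + i, i);
        }
        return m;
    }
}
```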
: - What's the units on the size value of the caches? Megs, number of
: queries, kilobytes? Not described anywhere.
"entries" ... the number of items allowed in the cache.
: - Any way to programatically change the OR/AND preference of the query
: parser? I set it to AND by default for user queries, but i'd like to set
: it to OR for some server-side queries I must do (find related articles,
: order by score).
you mean using StandardRequestHandler? ... not that i can think of off the
top of my head, but typically it makes sense to just configure what you
want for your "users" in the schema, and then make any machine generated
queries be explicit.
: - Whats the difference between the 2 commits type? Blocking and
: non-blocking. Didn't see any differences at all, tried both.
do you mean the waitFlush and waitSearcher options?
if either of those is true, you shouldn't get a response back from the
server until they have finished. if they are false, then the server
should respond instantly even if it takes several seconds (or maybe even
minutes) to complete the operation (optimizes can take a while in some
cases -- as can opening newSearchers if you have a lot of cache warming
configured)
: - Every time I do an <optimize> command, I get the following in my
: catalina logs - should I do anything about it?
the optimize command needs to be well formed XML, try "<optimize/>"
instead of just "<optimize>"
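Putting the last two answers together, well-formed versions of the update commands look like this (the waitFlush/waitSearcher attributes are the options Hoss describes above; the values shown are illustrative):

```xml
<!-- Blocking: the response returns only after the flush finishes and
     the new searcher is warmed and registered. -->
<commit waitFlush="true" waitSearcher="true"/>

<!-- Non-blocking: the server responds immediately, even if the optimize
     itself runs for minutes. Note the self-closing tag: a bare start
     tag without the trailing slash is not well-formed XML. -->
<optimize waitFlush="false" waitSearcher="false"/>
```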
: - Any benefits of setting the allowed memory for Tomcat higher? Right
: now im allocating 384 megs.
the more memory you've got, the more caching you can support ... but if
your index changes so frequently compared to the rate of *unique*
queries you get that your caches never fill up, it may not matter.
-Hoss
RE: Got it working! And some questions
Posted by Brian Lucas <bl...@gmail.com>.
Hi Michael,
I apologize for the lack of testing on the SolPHP. I had to "strip" it down
significantly to turn it into a general class that would be usable and the
version up there has not been extensively tested yet (I'm almost ready to
get back to that and "revise" it), plus much of my coding is done in Rails
at the moment. However...
If you have a new version, could you send it over my way or just upload it
to the wiki? I'd like to take a look at the changes and throw your revised
version up there or integrate both versions into a cleaner revision of the
version already there.
With respect to batch queries, it's already designed to do that (that's why
you see "array($array)" in the example, because it accepts an array of
updates) but I'd definitely like to see how you revised it.
Thanks,
Brian
-----Original Message-----
From: Michael Imbeault [mailto:michael.imbeault@sympatico.ca]
Sent: Saturday, September 09, 2006 12:30 PM
To: solr-user@lucene.apache.org
Subject: Got it working! And some questions
First of all, in reference to
http://www.mail-archive.com/solr-user@lucene.apache.org/msg00808.html ,
I got it working! The problems were coming from solPHP; the
implementation in the wiki isn't really working, to be honest, at least
for me. I had to modify it significantly in multiple places to get it
working. Tomcat 5.5, WAMP and Windows XP.
The main problem was that addIndex was sending 1 doc at a time to Solr;
it would cause a problem after a few thousand docs because I was running
out of resources. I modified solr_update.php to handle batch queries,
and I'm now sending batches of 1000 docs at a time. Great indexing speed.
Had a slight problem with the curl function of solr_update.php; the
custom HTTP header wasn't recognized; I now use curl_setopt($ch,
CURLOPT_POST, 1); curl_setopt($ch, CURLOPT_POSTFIELDS, $post_string); -
much simpler, and now everything works!
So far I have indexed 15,000,000 documents (my whole collection,
basically) and the performance I'm getting is INCREDIBLE (sub-100ms
query time without warmup and no optimization at all on a 7 GB index -
and with the cache, it gets stupid fast)! Seriously, Solr amazes me every
time I use it. I increased the HashDocSet maxSize to 75000, and will
continue to optimize this value - it helped a great deal. I will try
disMaxHandler soon too; right now the standard one is great. And I will
index with a better stopword file; the default one could really use
improvements.
Some questions (couldn't find the answer in the docs):
- Is the solr php in the wiki working out of the box for anyone? Else we
could modify the wiki...
- What is the loadFactor variable of HashDocSet? Should I optimize it too?
- What are the units for the size value of the caches? Megs, number of
queries, kilobytes? Not described anywhere.
- Any way to programmatically change the OR/AND preference of the query
parser? I set it to AND by default for user queries, but I'd like to set
it to OR for some server-side queries I must do (find related articles,
order by score).
- What's the difference between the 2 commit types, blocking and
non-blocking? Didn't see any differences at all; tried both.
- Every time I do an <optimize> command, I get the following in my
catalina logs - should I do anything about it?
9-Sep-2006 2:24:40 PM org.apache.solr.core.SolrException log
SEVERE: Exception during commit/optimize:java.io.EOFException: no more
data available - expected end tag </optimize> to close start tag
<optimize> from line 1, parser stopped on START_TAG seen <optimize>... @1:10
- Any benefits of setting the allowed memory for Tomcat higher? Right
now im allocating 384 megs.
Can't wait to try the new Faceted Queries... seriously, Solr is really,
really awesome so far. Thanks for all your work, and sorry for all
the questions!
--
Michael Imbeault
CHUL Research Center (CHUQ)
2705 boul. Laurier
Ste-Foy, QC, Canada, G1V 4G2
Tel: (418) 654-2705, Fax: (418) 654-2212
Got it working! And some questions
Posted by Michael Imbeault <mi...@sympatico.ca>.
First of all, in reference to
http://www.mail-archive.com/solr-user@lucene.apache.org/msg00808.html ,
I got it working! The problems were coming from solPHP; the
implementation in the wiki isn't really working, to be honest, at least
for me. I had to modify it significantly in multiple places to get it
working. Tomcat 5.5, WAMP and Windows XP.
The main problem was that addIndex was sending 1 doc at a time to Solr;
it would cause a problem after a few thousand docs because I was running
out of resources. I modified solr_update.php to handle batch queries,
and I'm now sending batches of 1000 docs at a time. Great indexing speed.
Had a slight problem with the curl function of solr_update.php; the
custom HTTP header wasn't recognized; I now use curl_setopt($ch,
CURLOPT_POST, 1); curl_setopt($ch, CURLOPT_POSTFIELDS, $post_string); -
much simpler, and now everything works!
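The batching change described above was made in PHP (solr_update.php), but the shape of the payload is easy to sketch. Here is a hypothetical Java helper, not the actual script, that builds one <add> message containing a whole batch of documents, which is what gets POSTed to Solr's update URL in a single request; the field layout and the minimal escaping are illustrative assumptions:

```java
import java.util.List;
import java.util.Map;

class BatchAddXml {
    // Builds a single <add> payload holding every document in the batch,
    // so one HTTP POST replaces one POST per document.
    static String build(List<Map<String, String>> docs) {
        StringBuilder sb = new StringBuilder("<add>");
        for (Map<String, String> doc : docs) {
            sb.append("<doc>");
            for (Map.Entry<String, String> field : doc.entrySet()) {
                sb.append("<field name=\"").append(field.getKey()).append("\">")
                  .append(escape(field.getValue()))
                  .append("</field>");
            }
            sb.append("</doc>");
        }
        return sb.append("</add>").toString();
    }

    // Minimal XML escaping for element text; real code should also
    // escape attribute values and handle quotes.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }
}
```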
So far I have indexed 15,000,000 documents (my whole collection,
basically) and the performance I'm getting is INCREDIBLE (sub-100ms
query time without warmup and no optimization at all on a 7 GB index -
and with the cache, it gets stupid fast)! Seriously, Solr amazes me every
time I use it. I increased the HashDocSet maxSize to 75000, and will
continue to optimize this value - it helped a great deal. I will try
disMaxHandler soon too; right now the standard one is great. And I will
index with a better stopword file; the default one could really use
improvements.
Some questions (couldn't find the answer in the docs):
- Is the solr php in the wiki working out of the box for anyone? Else we
could modify the wiki...
- What is the loadFactor variable of HashDocSet? Should I optimize it too?
- What are the units for the size value of the caches? Megs, number of
queries, kilobytes? Not described anywhere.
- Any way to programmatically change the OR/AND preference of the query
parser? I set it to AND by default for user queries, but I'd like to set
it to OR for some server-side queries I must do (find related articles,
order by score).
- What's the difference between the 2 commit types, blocking and
non-blocking? Didn't see any differences at all; tried both.
- Every time I do an <optimize> command, I get the following in my
catalina logs - should I do anything about it?
9-Sep-2006 2:24:40 PM org.apache.solr.core.SolrException log
SEVERE: Exception during commit/optimize:java.io.EOFException: no more
data available - expected end tag </optimize> to close start tag
<optimize> from line 1, parser stopped on START_TAG seen <optimize>... @1:10
- Any benefits of setting the allowed memory for Tomcat higher? Right
now im allocating 384 megs.
Can't wait to try the new Faceted Queries... seriously, Solr is really,
really awesome so far. Thanks for all your work, and sorry for all
the questions!
--
Michael Imbeault
CHUL Research Center (CHUQ)
2705 boul. Laurier
Ste-Foy, QC, Canada, G1V 4G2
Tel: (418) 654-2705, Fax: (418) 654-2212