Posted to solr-user@lucene.apache.org by Peyman Faratin <pe...@robustlinks.com> on 2012/08/03 06:56:38 UTC
synonym file
Hi
I have a (23M) synonym file that takes a long time (3 or so minutes) to load and, once included, seems to increase the QTime of the application by approximately 4 orders of magnitude.
Any advice on how to load it faster and lower the QTime would be much appreciated.
best
Peyman
Re: synonym file
Posted by Lance Norskog <go...@gmail.com>.
If you must have them at query time, you need a custom implementation
for very, very large files :) If you can use these synonyms at index
time instead of query time, that would help. When you index, do not
call commit very often.
The synonym filter implementation has a feature where it only saves
the first of a set of synonyms, so you don't get term explosion.
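To make the "first of a set" idea concrete, here is a rough Python sketch of the same reduction done outside Solr. It assumes a synonyms file of comma-separated sets with "#" comment lines and single-word terms only; the function names and sample data are made up for illustration and are not part of Solr.

```python
# Hypothetical helper names; this is not Solr code, only an illustration
# of collapsing each synonym set onto its first term.

def build_canonical_map(lines):
    """Map every term in a comma-separated synonym set to the set's first term."""
    canonical = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        terms = [t.strip().lower() for t in line.split(",") if t.strip()]
        if not terms:
            continue
        first = terms[0]
        for term in terms:
            # setdefault keeps the earliest mapping if a term appears twice
            canonical.setdefault(term, first)
    return canonical


def normalize(tokens, canonical):
    """Replace each token by its canonical form; unknown tokens pass through lowercased."""
    return [canonical.get(tok.lower(), tok.lower()) for tok in tokens]


# Assumed sample data for the sketch
SAMPLE = ["# comment line", "tv, television, telly", "couch, sofa"]
CANONICAL = build_canonical_map(SAMPLE)
```

Because every member of a set maps to one term, each token yields exactly one output token, so the number of terms does not grow. Multi-word synonyms would need phrase matching, which this sketch leaves out.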
May I ask what is the use case?
--
Lance Norskog
goksron@gmail.com
Re: synonym file
Posted by Jack Krupansky <ja...@basetechnology.com>.
I see that the new FSTSynonymFilterFactory is only "delegated" for Lucene
3.4 and later.
I vaguely recall that there was also a recent improvement in loading of
files for filters.
-- Jack Krupansky
-----Original Message-----
From: Michael McCandless
Sent: Friday, August 03, 2012 11:32 AM
To: solr-user@lucene.apache.org
Subject: Re: synonym file
Actually FST (and SynFilter based on it) was backported to 3.x.
Mike McCandless
http://blog.mikemccandless.com
Re: synonym file
Posted by Michael McCandless <lu...@mikemccandless.com>.
Actually FST (and SynFilter based on it) was backported to 3.x.
Mike McCandless
http://blog.mikemccandless.com
On Fri, Aug 3, 2012 at 11:28 AM, Jack Krupansky <ja...@basetechnology.com> wrote:
> The Lucene FST guys made a big improvement in synonym filtering in
> Lucene/Solr 4.0 using FSTs. Or are you already using that?
>
> Or if you are stuck with pre-4.0, you could do a preprocessor that
> efficiently generates boolean queries for the synonym expansions. That
> should give you more decent query times, assuming you develop a decent
> synonym lookup filter.
>
> Maybe you could backport the 4.0 FST code, or at least use the same
> techniques for your own preprocessor.
>
> -- Jack Krupansky
Re: synonym file
Posted by Jack Krupansky <ja...@basetechnology.com>.
The Lucene FST guys made a big improvement in synonym filtering in
Lucene/Solr 4.0 using FSTs. Or are you already using that?
Or if you are stuck with pre-4.0, you could write a preprocessor that
efficiently generates boolean queries for the synonym expansions. That
should give you more reasonable query times, assuming you develop a decent
synonym lookup filter.
Maybe you could backport the 4.0 FST code, or at least use the same
techniques for your own preprocessor.
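As a rough sketch of what such a preprocessor could look like (an illustration only, not the Lucene FST code): load the synonym sets into an in-memory hash map once, then rewrite each incoming query into a boolean query before sending it to Solr. The function names, the comma-separated file format and the sample data are assumptions; the sketch handles single-token lookups only and does not escape query-syntax special characters.

```python
from collections import defaultdict


def load_synonym_groups(lines):
    """Build a term -> set-of-equivalent-terms map from comma-separated lines."""
    groups = defaultdict(set)
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        terms = [t.strip().lower() for t in line.split(",") if t.strip()]
        for term in terms:
            groups[term].update(terms)
    return groups


def quote(term):
    """Wrap multi-word alternatives in double quotes so they stay a phrase."""
    return f'"{term}"' if " " in term else term


def expand_query(query, groups):
    """Turn 'a b' into 'a AND (b OR syn1 OR syn2)' using one map lookup per token."""
    clauses = []
    for token in query.lower().split():
        alternatives = groups.get(token)
        if alternatives:
            # original token first, remaining alternatives in sorted order
            ordered = [token] + sorted(alternatives - {token})
            clauses.append("(" + " OR ".join(quote(t) for t in ordered) + ")")
        else:
            clauses.append(token)
    return " AND ".join(clauses)
```

For example, with the sets "tv, television, flat screen" and "couch, sofa", the query "cheap tv" becomes: cheap AND (tv OR "flat screen" OR television). The lookup cost per token is independent of the size of the synonym file, although the file still has to be parsed once at startup.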
-- Jack Krupansky