Posted to solr-user@lucene.apache.org by Peyman Faratin <pe...@robustlinks.com> on 2012/08/03 06:56:38 UTC

synonym file

Hi

I have a (23M) synonym file that takes a long time (3 or so minutes) to load and, once included, seems to increase the QTime of the application by approximately 4 orders of magnitude.

Any advice on how to load it faster and lower the QTime would be much appreciated.

best

Peyman

Re: synonym file

Posted by Lance Norskog <go...@gmail.com>.
If you must have them at query time, you need a custom implementation
for very, very large files :) If you can use these synonyms at index
time instead of query time, that would help. When you index, do not
call commit very often.

The synonym filter implementation has a feature where it only saves
the first of a set of synonyms, so you don't get term explosion.
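
The "first of a set" behaviour described above can be illustrated with a
minimal Python sketch. It assumes a simplified synonym file with one
comma-separated group per line and single-word terms only; the function
names are hypothetical and this is not Solr's implementation. Every term in
a group is mapped to the group's first term, so each token is replaced by
exactly one token and the number of terms does not grow.

```python
def build_canonical_map(lines):
    """Map every term in a comma-separated group to the group's first term."""
    canonical = {}
    for raw in lines:
        line = raw.strip()
        # Skip blank lines and comments; explicit "a => b" rules are
        # deliberately ignored in this simplified sketch.
        if not line or line.startswith("#") or "=>" in line:
            continue
        terms = [t.strip().lower() for t in line.split(",") if t.strip()]
        if len(terms) < 2:
            continue
        first = terms[0]
        for term in terms:
            # If a term appears in several groups, keep the first mapping seen.
            canonical.setdefault(term, first)
    return canonical


def normalize(tokens, canonical):
    """Replace each token by its canonical form; the token count is unchanged."""
    return [canonical.get(tok.lower(), tok.lower()) for tok in tokens]


if __name__ == "__main__":
    mapping = build_canonical_map(["ipod, i-pod, mp3player", "tv, television"])
    print(normalize(["Television", "and", "i-pod"], mapping))
```

Applying the same mapping at index time and at query time keeps both sides
consistent, at the cost of not being able to tell the variants apart later.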

May I ask what is the use case?




-- 
Lance Norskog
goksron@gmail.com

Re: synonym file

Posted by Jack Krupansky <ja...@basetechnology.com>.
I see that the new FSTSynonymFilterFactory is only delegated to when the
Lucene match version is 3.4 or later.

I vaguely recall that there was also a recent improvement in loading of 
files for filters.

-- Jack Krupansky

-----Original Message----- 
From: Michael McCandless
Sent: Friday, August 03, 2012 11:32 AM
To: solr-user@lucene.apache.org
Subject: Re: synonym file

Actually FST (and SynFilter based on it) was backported to 3.x.

Mike McCandless

http://blog.mikemccandless.com



Re: synonym file

Posted by Michael McCandless <lu...@mikemccandless.com>.
Actually FST (and SynFilter based on it) was backported to 3.x.

Mike McCandless

http://blog.mikemccandless.com

On Fri, Aug 3, 2012 at 11:28 AM, Jack Krupansky <ja...@basetechnology.com> wrote:
> The Lucene FST guys made a big improvement in synonym filtering in
> Lucene/Solr 4.0 using FSTs. Or are you already using that?
>
> Or if you are stuck with pre-4.0, you could do a preprocessor that
> efficiently generates boolean queries for the synonym expansions. That
> should give you more decent query times, assuming you develop a decent
> synonym lookup filter.
>
> Maybe you could backport the 4.0 FST code, or at least use the same
> techniques for your own preprocessor.
>
> -- Jack Krupansky

Re: synonym file

Posted by Jack Krupansky <ja...@basetechnology.com>.
The Lucene FST guys made a big improvement in synonym filtering in 
Lucene/Solr 4.0 using FSTs. Or are you already using that?

Or if you are stuck with pre-4.0, you could do a preprocessor that 
efficiently generates boolean queries for the synonym expansions. That 
should give you more decent query times, assuming you develop a decent 
synonym lookup filter.

Maybe you could backport the 4.0 FST code, or at least use the same 
techniques for your own preprocessor.
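
A minimal Python sketch of the preprocessor idea above, under stated
assumptions: the synonym file holds one comma-separated group per line, the
incoming query is split on whitespace, and the output is a query string
using OR, AND, parentheses and quoted phrases. All function names are
hypothetical, and escaping of query-parser special characters is left out.
The lookup is a plain in-memory dictionary rather than an FST.

```python
def load_synonyms(lines):
    """Build a term -> sorted list of alternatives lookup from synonym groups."""
    groups = {}
    for raw in lines:
        terms = [t.strip().lower() for t in raw.split(",") if t.strip()]
        for term in terms:
            groups.setdefault(term, set()).update(terms)
    return {term: sorted(alts) for term, alts in groups.items()}


def quote(term):
    """Quote multi-word alternatives so they are searched as phrases."""
    return '"%s"' % term if " " in term else term


def expand_query(query, synonyms, operator="AND"):
    """Rewrite a whitespace-tokenized query into a boolean query string."""
    clauses = []
    for token in query.lower().split():
        alts = synonyms.get(token, [token])
        if len(alts) == 1:
            clauses.append(quote(alts[0]))
        else:
            clauses.append("(" + " OR ".join(quote(a) for a in alts) + ")")
    return (" %s " % operator).join(clauses)


if __name__ == "__main__":
    synonyms = load_synonyms(["tv, television, tele vision", "cheap, inexpensive"])
    print(expand_query("cheap tv", synonyms))
```

Because the expansion happens before the query reaches the search engine,
the large synonym file never has to be loaded by the query-time analyzer;
only multi-word query terms would need a smarter tokenizer than the
whitespace split used here.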

-- Jack Krupansky
