Posted to solr-user@lucene.apache.org by Christopher Ball <ch...@metaheuristica.com> on 2010/01/05 04:45:14 UTC

Listing Terms by Ascending IDF value . . ?

Hello,

I am trying to get a list of highly unusual terms or phrases (for example, a
TF of 1 or 2) within an entire index (essentially this would be the inverse
of how Luke gives 'top terms' on the 'Overview' tab).

I see how I can do this within a specific query using the Term Vector
Component (qt=tvrh).

But do I have to write my own analyzer to get a list for the complete index
in ascending order?

Most grateful for any thoughts or insights,

Christopher

RE: Listing Terms by Ascending IDF value . . ?

Posted by Christopher Ball <ch...@metaheuristica.com>.
Thanks. I was overlooking the TermsComponent, and given that I can specify
terms.maxcount, I can live without the ascending order.
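
A minimal sketch of the approach described above: ask TermsComponent only for
terms whose count is at most terms.maxcount, then sort the result in ascending
order on the client. The base URL, the field name "text", and the helper names
are assumptions made for illustration. The parser also assumes the JSON
response lists each field's terms as a flat, alternating [term, count, ...]
array, which may differ depending on your Solr version and json.nl setting.

```python
from urllib.parse import urlencode


def build_terms_url(base_url, field, max_count=2):
    """Build a TermsComponent request for terms with a count <= max_count.

    base_url and field are placeholders; adjust them to your installation.
    """
    params = {
        "terms": "true",
        "terms.fl": field,            # field whose terms are listed
        "terms.maxcount": max_count,  # upper bound on the term count
        "terms.limit": -1,            # no limit on the number of terms
        "terms.sort": "index",        # index (lexicographical) order
        "wt": "json",
    }
    return base_url.rstrip("/") + "/terms?" + urlencode(params)


def rare_terms_ascending(response, field):
    """Return (term, count) pairs sorted by ascending count, then by term.

    Assumes response["terms"][field] is a flat [term, count, term, count, ...]
    list.
    """
    flat = response["terms"][field]
    pairs = list(zip(flat[0::2], flat[1::2]))
    return sorted(pairs, key=lambda pair: (pair[1], pair[0]))
```

The sort happens on the client because TermsComponent itself offers only
descending count order or index order.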

-----Original Message-----
From: Shalin Shekhar Mangar [mailto:shalinmangar@gmail.com] 
Sent: Tuesday, January 05, 2010 2:56 AM
To: solr-user@lucene.apache.org
Subject: Re: Listing Terms by Ascending IDF value . . ?

On Tue, Jan 5, 2010 at 9:15 AM, Christopher Ball <
christopher.ball@metaheuristica.com> wrote:

> Hello,
>
> I am trying to get a list of highly unusual terms or phrases (for example a
> TF of 1 or 2) within an entire index (essentially this would be the inverse
> of how Luke gives 'top terms' on the 'Overview' tab).
>
> I see how I can do this within a specific query using the Term Vector
> Component (qt=tvrh).
>
>
Did you mean TermsComponent (qt=terms)?


> But do I have to write my own analyzer to get a list for the complete index
> in ascending order?
>
>
No, you don't need a custom analyzer. But TermsComponent can only sort by
frequency in descending order or by index order (lexicographical order).

Perhaps the patch in SOLR-1672 is more suitable for your task.

-- 
Regards,
Shalin Shekhar Mangar.


Re: Reload synonyms

Posted by Chris Hostetter <ho...@fucit.org>.
: Subject: Reload synonyms
: References: <00...@cgifederal.com>
:  <69...@mail.gmail.com>
: In-Reply-To: <69...@mail.gmail.com>

http://people.apache.org/~hossman/#threadhijack
Thread Hijacking on Mailing Lists

When starting a new discussion on a mailing list, please do not reply to 
an existing message, instead start a fresh email.  Even if you change the 
subject line of your email, other mail headers still track which thread 
you replied to and your question is "hidden" in that thread and gets less 
attention.   It makes following discussions in the mailing list archives 
particularly difficult.
See Also:  http://en.wikipedia.org/wiki/User:DonDiego/Thread_hijacking



-Hoss


Re: Reload synonyms

Posted by Siddhant Goel <si...@gmail.com>.
On Tue, Jan 5, 2010 at 2:24 PM, Peter A. Kirk <pk...@alpha-solutions.dk> wrote:

> Thanks for the answer. How does one "reload" a core? Is there an API, or a
> url one can use?
>

I think this should be it - http://wiki.apache.org/solr/CoreAdmin#RELOAD

-- 
- Siddhant
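
For reference, the RELOAD action described on that wiki page is an HTTP request
to the CoreAdmin handler. The sketch below only builds the URL. The base URL,
the core name "core0", and the admin/cores path are assumptions; the path is
whatever adminPath is set to in your solr.xml.

```python
from urllib.parse import urlencode


def build_reload_url(base_url, core_name):
    """Build a CoreAdmin RELOAD request URL for one core.

    Assumes the CoreAdmin handler is mounted at admin/cores.
    """
    params = {"action": "RELOAD", "core": core_name}
    return base_url.rstrip("/") + "/admin/cores?" + urlencode(params)
```

Fetching the resulting URL (for example with urllib.request.urlopen, or by
pasting it into a browser) asks Solr to reload that core.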

RE: Reload synonyms

Posted by "Peter A. Kirk" <pk...@alpha-solutions.dk>.
Thanks for the answer. How does one "reload" a core? Is there an API, or a url one can use?


Med venlig hilsen / Best regards

Peter Kirk
E-mail: mailto:pk@alpha-solutions.dk


-----Original Message-----
From: Shalin Shekhar Mangar [mailto:shalinmangar@gmail.com] 
Sent: 5 January 2010 21:46
To: solr-user@lucene.apache.org
Subject: Re: Reload synonyms

On Tue, Jan 5, 2010 at 2:03 PM, Peter A. Kirk <pk...@alpha-solutions.dk> wrote:

>
> Is it possible to reload the synonym list, if for example "synonyms.txt" is
> changed, without having to restart the server? Is the same possible with
> stop-words?
>
>
Yes, you can reload a core, but there are two catches:

   1. Reloading a core is only possible if you are using an installation
   with solr.xml (i.e. a multi-core installation). The current trunk allows
   reloading a single-core installation too, but this was added after Solr 1.4
   was released.
   2. SynonymFilter and StopFilter must not be used at index time. If
   they are, you will need to re-index your documents after reloading the
   core, because already-indexed documents keep the terms produced by the old
   files.

-- 
Regards,
Shalin Shekhar Mangar.

Re: Reload synonyms

Posted by Shalin Shekhar Mangar <sh...@gmail.com>.
On Tue, Jan 5, 2010 at 2:03 PM, Peter A. Kirk <pk...@alpha-solutions.dk> wrote:

>
> Is it possible to reload the synonym list, if for example "synonyms.txt" is
> changed, without having to restart the server? Is the same possible with
> stop-words?
>
>
Yes, you can reload a core, but there are two catches:

   1. Reloading a core is only possible if you are using an installation
   with solr.xml (i.e. a multi-core installation). The current trunk allows
   reloading a single-core installation too, but this was added after Solr 1.4
   was released.
   2. SynonymFilter and StopFilter must not be used at index time. If
   they are, you will need to re-index your documents after reloading the
   core, because already-indexed documents keep the terms produced by the old
   files.

-- 
Regards,
Shalin Shekhar Mangar.

Reload synonyms

Posted by "Peter A. Kirk" <pk...@alpha-solutions.dk>.
Hi

Is it possible to reload the synonym list, if for example "synonyms.txt" is changed, without having to restart the server? Is the same possible with stop-words?

Thanks,
Peter

Re: Listing Terms by Ascending IDF value . . ?

Posted by Shalin Shekhar Mangar <sh...@gmail.com>.
On Tue, Jan 5, 2010 at 9:15 AM, Christopher Ball <
christopher.ball@metaheuristica.com> wrote:

> Hello,
>
> I am trying to get a list of highly unusual terms or phrases (for example a
> TF of 1 or 2) within an entire index (essentially this would be the inverse
> of how Luke gives 'top terms' on the 'Overview' tab).
>
> I see how I can do this within a specific query using the Term Vector
> Component (qt=tvrh).
>
>
Did you mean TermsComponent (qt=terms)?


> But do I have to write my own analyzer to get a list for the complete index
> in ascending order?
>
>
No, you don't need a custom analyzer. But TermsComponent can only sort by
frequency in descending order or by index order (lexicographical order).

Perhaps the patch in SOLR-1672 is more suitable for your task.

-- 
Regards,
Shalin Shekhar Mangar.