Posted to solr-user@lucene.apache.org by kingkong <ja...@yahoo.com> on 2011/11/08 23:41:54 UTC

Keyword counts

I'm a newbie with Solr. Is there a way to create document counts for a list
of keywords without using a facet field? For example, say I have a
fruit-related web site and want to list the top fruits on the main page:
apples (23), oranges (14), pears (5), etc.

The fruits are the "keywords" that are of interest at the moment. However,
say I begin adding documents with other keywords of interest, for example,
"bananas". 

Using a facet field, I would need to go back and re-index all the old
documents to see if any of them contain the term "banana" in order to get
accurate counts. 

Is it possible to have an Analyzer, Tokenizer or Token Filter automatically
re-index as new keywords are added to a list and create the correct counts?
If so, how would I do it?

My goal is to have a keyword list, which may be updated from time to time,
and display the counts for each on the main page (without having to manually
re-index old documents).

Thank you.
Jason

--
View this message in context: http://lucene.472066.n3.nabble.com/Keyword-counts-tp3491955p3491955.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Keyword counts

Posted by Chris Hostetter <ho...@fucit.org>.
: Thanks for the reply. There are many keyword terms (<1000?) and not sure if
: Solr would choke on a query string that long. Perhaps solr is not built to

Did you try it?

1000 facet.query params is not a strain for Solr -- but you may find 
problems with your servlet container if you try specifying them all in a 
GET request.
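
One way around the GET length limit is to send the same parameters in a
POST body. A minimal sketch, assuming a handler at
http://localhost:8983/solr/select and a searchable field named "text"
(both are placeholders, as is the helper name); it only builds the request
and does not send it:

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_facet_post(solr_url, field, keywords):
    """Build a POST request carrying one facet.query per keyword."""
    params = [("q", "*:*"), ("rows", "0"), ("wt", "json"), ("facet", "true")]
    # Repeating the same key is how multiple facet.query values are passed.
    params += [("facet.query", f"{field}:{kw}") for kw in keywords]
    body = urlencode(params).encode("utf-8")
    return Request(
        solr_url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Hypothetical endpoint and field name; substitute your own.
req = build_facet_post(
    "http://localhost:8983/solr/select", "text", ["apples", "oranges", "pears"]
)
```

Passing req to urllib.request.urlopen would send it; the keyword list can
grow to hundreds of entries without touching the URL length.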

If this list isn't going to change very often, it sounds like a perfect use
case for specifying them as "appends" request params on the request
handler declaration in your solrconfig.xml.

See the comments in solrconfig.xml for examples.
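
For a long keyword list, the "appends" block could be generated rather than
typed by hand. A rough sketch, assuming the keywords live in a Python list
and the field is called "text" (both hypothetical); the output is meant to
be pasted inside the <requestHandler> element in solrconfig.xml:

```python
from xml.sax.saxutils import escape


def appends_fragment(field, keywords):
    """Return an <lst name="appends"> block with one facet.query per keyword."""
    lines = ['<lst name="appends">']
    for kw in keywords:
        # Escape &, < and > so the fragment stays well-formed XML.
        query = escape(f"{field}:{kw}")
        lines.append(f'  <str name="facet.query">{query}</str>')
    lines.append("</lst>")
    return "\n".join(lines)


# Hypothetical field name and keyword list.
fragment = appends_fragment("text", ["apples", "oranges", "pears"])
print(fragment)
```

The appended facet.query params only produce counts when faceting is
switched on (facet=true), either on the request or in the handler's
defaults.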

-Hoss

Re: Keyword counts

Posted by kingkong <ja...@yahoo.com>.
lol. "It's not actually as hard as it sounds". When I understand what you
said, then I may agree. :-)
Thanks again!


Re: Keyword counts

Posted by Erick Erickson <er...@gmail.com>.
There's really nothing in the Solr architecture that automatically
reindexes anything; you have to feed docs to Solr.

You could write a custom search component that tacked this
data on to the response packet at whatever granularity you
required. It's not actually as hard as it sounds and you wouldn't
have to re-index.

Best
Erick

On Thu, Nov 10, 2011 at 12:45 PM, kingkong <ja...@yahoo.com> wrote:
> Thanks for the reply. There are many keyword terms (<1000?) and not sure if
> Solr would choke on a query string that long. Perhaps solr is not built to
> handle this type of internal re-indexing.
>
> Thank you.

Re: Keyword counts

Posted by kingkong <ja...@yahoo.com>.
Thanks for the reply. There are many keyword terms (<1000?) and not sure if
Solr would choke on a query string that long. Perhaps solr is not built to
handle this type of internal re-indexing. 

Thank you.   


Re: Keyword counts

Posted by Erick Erickson <er...@gmail.com>.
Assuming that there aren't too many of these, you can use
facet.query=field:banana&facet.query=field:oranges etc,
repeated as many times as you need.
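
As an illustration of the repeated facet.query approach, here is a small
sketch that builds such a URL and reads the counts back out of a JSON
response (wt=json). The base URL and helper names are assumptions, and the
sample response is hand-written using the counts from the original
question, not real Solr output:

```python
import json
from urllib.parse import urlencode


def facet_query_url(base_url, field, keywords):
    """Build a select URL with one facet.query parameter per keyword."""
    params = [("q", "*:*"), ("rows", "0"), ("wt", "json"), ("facet", "true")]
    params += [("facet.query", f"{field}:{kw}") for kw in keywords]
    return base_url + "?" + urlencode(params)


def keyword_counts(response_text):
    """Return (facet query, count) pairs, highest count first."""
    data = json.loads(response_text)
    counts = data["facet_counts"]["facet_queries"]
    return sorted(counts.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical base URL; "field" is the placeholder field name used above.
url = facet_query_url(
    "http://localhost:8983/solr/select", "field", ["banana", "oranges"]
)

# Hand-written sample shaped like the facet_queries section of a response.
sample = (
    '{"facet_counts": {"facet_queries": '
    '{"field:pears": 5, "field:apples": 23, "field:oranges": 14}}}'
)
top = keyword_counts(sample)
```

Because each count comes from a query run at search time, adding a new
keyword only means adding another facet.query; nothing is re-indexed.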

This gets pretty awkward if you have more than a dozen or so
facets, but you might be able to get some mileage out of
defining these as defaults in your requestHandler.

Alternatively, you could probably do something with a custom
search component that gleaned some of this info out of the
underlying Lucene index using some of the Lucene APIs.

Best
Erick

On Tue, Nov 8, 2011 at 5:41 PM, kingkong <ja...@yahoo.com> wrote:
> I'm a newbie with Solr. Is there a way to create document counts for a list
> of keywords without using a facet field? For example, say I have a fruit
> related web site and want to list on the main page the top fruits; apples
> (23), oranges (14), pears (5), etc.
>
> The fruits are the "keywords" that are of interest at the moment. However,
> say I begin adding documents with other keywords of interest, for example,
> "bananas".
>
> Using a facet field, I would need to go back and re-index all the old
> documents to see if any of them contain the term "banana" in order to get
> accurate counts.
>
> Is it possible to have an Analyzer, Tokenize or Token Filter automatically
> re-index as new keywords are added to a list and create the correct counts?
> If so, how would I do it?
>
> My goal is to have a keyword list, which may be updated from time to time,
> and display the counts for each on the main page (without having to manually
> re-index old documents).
>
> Thank you.
> Jason