Posted to solr-dev@lucene.apache.org by Emmanuel Castro Santana <em...@gmail.com> on 2009/02/26 18:50:59 UTC
Is there a built in keyword report (Tag Cloud) feature on Solr ?
I am developing a Solr-based search application and need to get a kind of
keyword report for tag cloud generation. If anyone here has had the same
need and has found a way to do it, I would really appreciate some help.
Thanks in advance
--
View this message in context: http://www.nabble.com/Is-there-a-built-in-keyword-report-%28Tag-Cloud%29-feature-on-Solr---tp22229677p22229677.html
Sent from the Solr - Dev mailing list archive at Nabble.com.
Re: Is there a built in keyword report (Tag Cloud) feature on Solr ?
Posted by Emmanuel Castro Santana <em...@gmail.com>.
Thanks for all the information, it has been really useful.
I didn't know that there were different names for tag clouds; that is also
good to know!
I don't feel really comfortable about keeping search-cloud-like information
in the index. We would like to keep those concerns separate, and for this
purpose I think the best way would be to develop a request handler, or a
component that can be used inside any request handler, to store all the
query information needed for search-cloud generation. I have also taken a
look at the TermVectorComponent. I don't know whether it will help me with
this issue, but it may be useful some other time.
Thanks
Aleksander M. Stensby wrote:
>
> Sorry, that mail got stuck in my outbox. Anyway, on a side note, I think
> it is called a search-cloud when referring to top searches, and a tag-cloud
> when referring to the most frequently occurring terms in the corpus, as
> Chris said.
>
> Since you are only after a search-cloud, I think my answer is a
> pretty straightforward, simple (and fast) approach to building one.
> And as Chris mentions, if you want to create a tag cloud from those words
> that are a.) occurring frequently in the corpus, or b.) more advanced,
> those terms that are actually "important" to your corpus (score-based /
> tf-idf etc.), you can simply use the TermsComponent. As the trunk version
> of Solr introduces the TermVectorComponent, you can also retrieve
> information for specific search results etc.
>
> Another thing you could do with your search-cloud is, for instance, to add a
> date dimension to the Solr index (where you store all the queries), and
> then out of the box you get the possibility of creating
> evolving search-clouds! I.e., you can visualize how "what is being
> searched for" changes over time! -> now that's a neat feature :) And best
> of all - Solr gives you this for free with facets once you have those
> queries indexed :)
>
> Hope that helps!
>
> Best regards,
> Aleksander
Re: Is there a built in keyword report (Tag Cloud) feature on Solr ?
Posted by "Aleksander M. Stensby" <al...@integrasco.no>.
Sorry, that mail got stuck in my outbox. Anyway, on a side note, I think
it is called a search-cloud when referring to top searches, and a tag-cloud
when referring to the most frequently occurring terms in the corpus, as
Chris said.
Since you are only after a search-cloud, I think my answer is a
pretty straightforward, simple (and fast) approach to building one.
And as Chris mentions, if you want to create a tag cloud from those words
that are a.) occurring frequently in the corpus, or b.) more advanced,
those terms that are actually "important" to your corpus (score-based /
tf-idf etc.), you can simply use the TermsComponent. As the trunk version
of Solr introduces the TermVectorComponent, you can also retrieve
information for specific search results etc.
Another thing you could do with your search-cloud is, for instance, to add a
date dimension to the Solr index (where you store all the queries), and
then out of the box you get the possibility of creating
evolving search-clouds! I.e., you can visualize how "what is being
searched for" changes over time! -> now that's a neat feature :) And best
of all - Solr gives you this for free with facets once you have those
queries indexed :)
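To make the date dimension a bit more concrete, here is a rough Python sketch that only builds the request parameters for "top queries in a given month". The field names `timestamp` and `query` are assumptions about how the logged queries might be indexed, not anything Solr defines, and the helper names are made up for illustration.

```python
from datetime import date
from urllib.parse import urlencode


def month_filter(year, month, field="timestamp"):
    """Solr range filter matching `field` values in one calendar month.

    `timestamp` is a hypothetical date field on the query-log documents.
    Both bounds are inclusive, so a query logged at exactly midnight on the
    first of the next month is counted in both months.
    """
    start = date(year, month, 1)
    # First day of the following month, rolling over after December.
    end = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
    fmt = "%Y-%m-%dT00:00:00Z"
    return "{}:[{} TO {}]".format(field, start.strftime(fmt), end.strftime(fmt))


def monthly_cloud_params(year, month, query_field="query", limit=50):
    """Request parameters for the top logged queries within one month."""
    return urlencode([
        ("q", "*:*"),
        ("fq", month_filter(year, month)),
        ("rows", 0),  # only facet counts are needed, not the documents
        ("facet", "true"),
        ("facet.field", query_field),
        ("facet.limit", limit),
    ])


if __name__ == "__main__":
    for month in (1, 2, 3):
        print(month_filter(2009, month))
```

Running the same request once per month and comparing the counts would give the raw data for an evolving cloud.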
Hope that helps!
Best regards,
Aleksander
On Fri, 27 Feb 2009 08:12:19 +0100, Aleksander M. Stensby
<al...@integrasco.no> wrote:
> To do that, your best option is to do it "outside" of Solr. I.e., when
> someone enters a query in your web application, you store the search in,
> for instance, a db (or even in a separate Solr index).
> If you go with a Solr index for "queries", you can simply facet on
> the queries with, for instance, facet.limit=50 (which will give you the
> top 50 most frequently entered queries).
>
> - Aleksander
--
Aleksander M. Stensby
Senior software developer
Integrasco A/S
www.integrasco.no
Please consider the environment before printing all or any of this e-mail
Re: Is there a built in keyword report (Tag Cloud) feature on Solr ?
Posted by "Aleksander M. Stensby" <al...@integrasco.no>.
To do that, your best option is to do it "outside" of Solr. I.e., when
someone enters a query in your web application, you store the search in,
for instance, a db (or even in a separate Solr index).
If you go with a Solr index for "queries", you can simply facet on the
queries with, for instance, facet.limit=50 (which will give you the top 50
most frequently entered queries).
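A minimal Python sketch of that faceting request, assuming each logged search is stored as one document with the raw query text in a field called `query` (a made-up name; it would need to be an untokenized string field so that whole queries, not single words, come back as facet values). The sample response in the usage part is hand-written for illustration and is not real Solr output.

```python
from urllib.parse import urlencode


def top_queries_params(field="query", limit=50):
    """Request parameters for the `limit` most frequent values of `field`."""
    return urlencode([
        ("q", "*:*"),
        ("rows", 0),  # only the facet counts are needed, not documents
        ("facet", "true"),
        ("facet.field", field),
        ("facet.limit", limit),
        ("wt", "json"),
    ])


def parse_facet_counts(response, field="query"):
    """Turn a flat [value, count, value, count, ...] facet list into pairs."""
    flat = response["facet_counts"]["facet_fields"][field]
    return list(zip(flat[0::2], flat[1::2]))


if __name__ == "__main__":
    print(top_queries_params())
    # Hand-written sample shaped like a JSON facet response.
    sample = {"facet_counts": {"facet_fields": {"query": ["ipod", 12, "solr", 7]}}}
    print(parse_facet_counts(sample))
```

The (query, count) pairs can then be fed into whatever sizing scheme the cloud uses.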
- Aleksander
On Thu, 26 Feb 2009 19:35:49 +0100, Emmanuel Castro Santana
<em...@gmail.com> wrote:
>
> Thanks for the help
>
> "... do a *:* search and then make tag clouds from all of the facets ..."
>
> I may have not made myself clear. When I say keyword report, I mean a kind
> of a most popular tag cloud, showing in bigger sizes the most searched
> terms. Therefore I need information about how many times specific terms have
> been searched and I can't see how I could accomplish that with this
> solution....
Re: Is there a built in keyword report (Tag Cloud) feature on Solr ?
Posted by Emmanuel Castro Santana <em...@gmail.com>.
Sorry about that. A tag cloud of the most searched terms is fairly common around here.
"Solr doesn't keep any record of the searches performed, so to build a tag
cloud based on query popularity you would need to mine your logs."
Do you know if there is already a tool or a Solr plugin for that?
Thanks
hossman wrote:
>
>
> : I may have not made myself clear. When I say keyword report, I mean a kind
> : of a most popular tag cloud, showing in bigger sizes the most searched
> : terms. Therefore I need information about how many times specific terms have
> : been searched and I can't see how I could accomplish that with this
> : solution....
>
> you have to be more explicit about what you ask for. I've never heard
> anyone refer to a tag cloud as being based on how often a term is searched
> for -- everyone i know uses the frequency of words in the corpus,
> sometimes with a decay function to promote words mentioned in more recent
> docs.
>
> Solr doesn't keep any record of the searches performed, so to build a tag
> cloud based on query popularity you would need to mine your logs.
>
> if you want a tag cloud based on the frequency of words in your corpus,
> the faceting approach mentioned would work -- but a simpler way to get
> term counts for the whole index (*:*) would be the TermsComponent. you
> only really need the facet based solution if you want a cloud based on a
> subset of documents, (ie: a cloud for all documents matching
> category:computer)
>
>
>
> -Hoss
>
>
>
Re: Is there a built in keyword report (Tag Cloud) feature on Solr ?
Posted by Walter Underwood <wu...@netflix.com>.
If you want a tag cloud based on query frequency, start with your
HTTP log analysis tools. Most of those generate a list of top
queries and top words in queries.
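If no log analysis tool is at hand, the counting itself is small enough to sketch in Python. This assumes access log lines in the common format with the request in double quotes, and that searches arrive with the query in a `q` parameter; the function name and the sample lines are made up for illustration.

```python
from collections import Counter
from urllib.parse import parse_qs, urlparse


def count_queries(log_lines, param="q"):
    """Count how often each value of `param` appears in request lines.

    Assumes each line contains a quoted request such as
    "GET /solr/select?q=ipod HTTP/1.1".
    """
    counts = Counter()
    for line in log_lines:
        parts = line.split('"')
        if len(parts) < 2:
            continue  # no quoted request section on this line
        request = parts[1].split()
        if len(request) < 2:
            continue  # malformed request
        query_string = urlparse(request[1]).query
        for value in parse_qs(query_string).get(param, []):
            # Normalize case and whitespace so "iPod" and "ipod" count together.
            counts[value.strip().lower()] += 1
    return counts


if __name__ == "__main__":
    sample = [
        '127.0.0.1 - - [26/Feb/2009:18:50:59 +0000] "GET /solr/select?q=ipod&rows=10 HTTP/1.1" 200 512',
        '127.0.0.1 - - [26/Feb/2009:18:51:03 +0000] "GET /solr/select?q=Solr+facets HTTP/1.1" 200 1024',
        '127.0.0.1 - - [26/Feb/2009:18:51:10 +0000] "GET /solr/select?q=iPod HTTP/1.1" 200 498',
        '127.0.0.1 - - [26/Feb/2009:18:51:12 +0000] "GET /solr/admin/ping HTTP/1.1" 200 80',
    ]
    for query, n in count_queries(sample).most_common():
        print(n, query)
```

Splitting each counted query on whitespace before incrementing would give top words instead of top queries.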
wunder
On 2/26/09 2:54 PM, "Chris Hostetter" <ho...@fucit.org> wrote:
>
> : I may have not made myself clear. When I say keyword report, I mean a kind
> : of a most popular tag cloud, showing in bigger sizes the most searched
> : terms. Therefore I need information about how many times specific terms have
> : been searched and I can't see how I could accomplish that with this
> : solution....
>
> you have to be more explicit about what you ask for. I've never heard
> anyone refer to a tag cloud as being based on how often a term is searched
> for -- everyone i know uses the frequency of words in the corpus,
> sometimes with a decay function to promote words mentioned in more recent
> docs.
>
> Solr doesn't keep any record of the searches performed, so to build a tag
> cloud based on query popularity you would need to mine your logs.
>
> if you want a tag cloud based on the frequency of words in your corpus,
> the faceting approach mentioned would work -- but a simpler way to get
> term counts for the whole index (*:*) would be the TermsComponent. you
> only really need the facet based solution if you want a cloud based on a
> subset of documents, (ie: a cloud for all documents matching
> category:computer)
>
>
>
> -Hoss
>
Re: Is there a built in keyword report (Tag Cloud) feature on Solr ?
Posted by Chris Hostetter <ho...@fucit.org>.
: I may have not made myself clear. When I say keyword report, I mean a kind
: of a most popular tag cloud, showing in bigger sizes the most searched
: terms. Therefore I need information about how many times specific terms have
: been searched and I can't see how I could accomplish that with this
: solution....
you have to be more explicit about what you ask for. I've never heard
anyone refer to a tag cloud as being based on how often a term is searched
for -- everyone i know uses the frequency of words in the corpus,
sometimes with a decay function to promote words mentioned in more recent
docs.
Solr doesn't keep any record of the searches performed, so to build a tag
cloud based on query popularity you would need to mine your logs.
if you want a tag cloud based on the frequency of words in your corpus,
the faceting approach mentioned would work -- but a simpler way to get
term counts for the whole index (*:*) would be the TermsComponent. you
only really need the facet based solution if you want a cloud based on a
subset of documents, (ie: a cloud for all documents matching
category:computer)
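For comparison, a small Python sketch that builds the two kinds of request URL. The base URL, the `/terms` handler path, and the field name `text` are assumptions about a particular setup (the TermsComponent has to be wired to a request handler in solrconfig.xml); only the parameter names are Solr's own.

```python
from urllib.parse import urlencode

# Hypothetical base URL; adjust to your own installation.
SOLR_BASE = "http://localhost:8983/solr"


def terms_url(field, limit=50):
    """URL asking the TermsComponent for the most frequent terms in `field`.

    Assumes a request handler at /terms with the TermsComponent enabled.
    """
    params = [
        ("terms", "true"),
        ("terms.fl", field),
        ("terms.limit", limit),
        ("terms.sort", "count"),
        ("wt", "json"),
    ]
    return SOLR_BASE + "/terms?" + urlencode(params)


def facet_url(field, filter_query, limit=50):
    """URL for a facet-based cloud over documents matching `filter_query`."""
    params = [
        ("q", "*:*"),
        ("fq", filter_query),
        ("rows", 0),  # only the facet counts are needed
        ("facet", "true"),
        ("facet.field", field),
        ("facet.limit", limit),
        ("wt", "json"),
    ]
    return SOLR_BASE + "/select?" + urlencode(params)


if __name__ == "__main__":
    print(terms_url("text"))
    print(facet_url("text", "category:computer"))
```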
-Hoss
Re: Is there a built in keyword report (Tag Cloud) feature on Solr ?
Posted by Emmanuel Castro Santana <em...@gmail.com>.
Thanks for the help
"... do a *:* search and then make tag clouds from all of the facets ..."
I may have not made myself clear. When I say keyword report, I mean a kind
of a most popular tag cloud, showing in bigger sizes the most searched
terms. Therefore I need information about how many times specific terms have
been searched and I can't see how I could accomplish that with this
solution....
Walter Underwood wrote:
>
> Oops, missed that you wanted it by facet. Never mind. --wunder
>
Re: Is there a built in keyword report (Tag Cloud) feature on Solr ?
Posted by Walter Underwood <wu...@netflix.com>.
Oops, missed that you wanted it by facet. Never mind. --wunder
On 2/26/09 9:57 AM, "Walter Underwood" <wu...@netflix.com> wrote:
> That info is already available via Luke, right? --wunder
>
Re: Is there a built in keyword report (Tag Cloud) feature on Solr ?
Posted by Walter Underwood <wu...@netflix.com>.
That info is already available via Luke, right? --wunder
On 2/26/09 9:55 AM, "Robert Douglass" <ro...@robshouse.net> wrote:
> A solution that I'm considering implementing for Drupal's ApacheSolr
> module is to do a *:* search and then make tag clouds from all of the
> facets. Pretty easy to sort all the facet terms into bins based on the
> number of documents they match, and then to translate bins to font
> sizes. Tag clouds make a nice alternate representation of facet blocks.
>
> Robert Douglass
>
> The RobsHouse.net Newsletter:
> http://robshouse.net/newsletter/robshousenet-newsletter
> Follow me on Twitter: http://twitter.com/robertDouglass
>
Re: Is there a built in keyword report (Tag Cloud) feature on Solr ?
Posted by Robert Douglass <ro...@robshouse.net>.
A solution that I'm considering implementing for Drupal's ApacheSolr
module is to do a *:* search and then make tag clouds from all of the
facets. Pretty easy to sort all the facet terms into bins based on the
number of documents they match, and then to translate bins to font
sizes. Tag clouds make a nice alternate representation of facet blocks.
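A minimal Python sketch of the binning step, assuming the facet counts have already been fetched as a term -> document-count mapping. The function name, the five font sizes, and the sample counts are made up for illustration, and equal-width bins over the observed count range are just one reasonable choice.

```python
def font_sizes(facet_counts, sizes=(10, 13, 16, 20, 24)):
    """Map each facet term to a font size by equal-width binning of its count.

    facet_counts: dict of term -> number of matching documents.
    sizes: font sizes from smallest to largest, one per bin (hypothetical values).
    """
    if not facet_counts:
        return {}
    lo = min(facet_counts.values())
    hi = max(facet_counts.values())
    span = hi - lo
    result = {}
    for term, count in facet_counts.items():
        if span == 0:
            # All terms are equally frequent: use the middle size for everything.
            index = len(sizes) // 2
        else:
            # Scale the count into a bin index; the largest count is clamped
            # into the last bin.
            index = min(int((count - lo) / span * len(sizes)), len(sizes) - 1)
        result[term] = sizes[index]
    return result


if __name__ == "__main__":
    sample = {"solr": 120, "lucene": 80, "drupal": 45, "facet": 10}
    for term, size in sorted(font_sizes(sample).items()):
        print(term, size)
```

With heavily skewed counts, binning on the logarithm of the count instead of the raw count may spread terms more evenly across sizes.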
Robert Douglass
The RobsHouse.net Newsletter: http://robshouse.net/newsletter/robshousenet-newsletter
Follow me on Twitter: http://twitter.com/robertDouglass
On Feb 26, 2009, at 6:50 PM, Emmanuel Castro Santana wrote:
>
> I am developing a Solr-based search application and need to get a kind of
> keyword report for tag cloud generation. If anyone here has had the same
> need and has found a way to do it, I would really appreciate some help.
> Thanks in advance