Posted to solr-user@lucene.apache.org by Chris Dawson <xr...@gmail.com> on 2012/08/02 17:34:32 UTC

Trending topics?

How would I generate a list of trending topics using solr?

Chris

Re: Trending topics?

Posted by Otis Gospodnetic <ot...@yahoo.com>.
Chris,

I'm not sure if Solr by itself can really do this (easily and/or well).
Have a look at http://sematext.com/products/key-phrase-extractor/index.html which can do exactly that, but without Solr.  Some of the highlighted bits refer to trending topics, though not using exactly that terminology.

Otis 
----
Performance Monitoring for Solr / ElasticSearch / HBase - http://sematext.com/spm 




Re: Trending topics?

Posted by Hasan Diwan <ha...@gmail.com>.
Tor,
I hope that the information in
http://www.jason-palmer.com/2011/05/creating-a-tag-cloud-with-solr-and-php/
helps. -- H
On 2 August 2012 15:48, Lance Norskog <go...@gmail.com> wrote:

> Two easy ones:
> 1) Facets on a text field are simple word counts by document: for each
> word, the number of documents that contain it.
> 2) If you want the number of times a word appears inside a document,
> that requires a separate dataset called a 'term vector'. This is a
> list of all words in a document with a count for each one.
> These are simple queries. There are also batch computations where you
> create a 'term-document matrix', with a row for each document and a
> column for all terms that appear in any document. These computations
> require exporting all of your data into a separate batch job.




Re: Trending topics?

Posted by Lance Norskog <go...@gmail.com>.
Two easy ones:
1) Facets on a text field are simple word counts by document: for each
word, the number of documents that contain it.
2) If you want the number of times a word appears inside a document,
that requires a separate dataset called a 'term vector'. This is a
list of all words in a document with a count for each one.
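As a rough illustration of option 1, here is a minimal Python sketch that builds a facet request and unpacks the counts. The parameters `facet`, `facet.field`, `facet.limit`, `facet.mincount` and `rows` are standard Solr request parameters; the base URL, the field name `text`, and both helper functions are assumptions made up for this example. No request is actually sent.

```python
from urllib.parse import urlencode


def facet_query_url(base_url, field, limit=10):
    """Build a Solr select URL that asks only for facet counts on one field.

    base_url and field are placeholders; adjust them to your own setup.
    """
    params = {
        "q": "*:*",           # match every document
        "rows": 0,            # we only want the facet counts, not the documents
        "facet": "true",
        "facet.field": field,
        "facet.limit": limit,
        "facet.mincount": 1,  # skip terms that occur in no matching document
        "wt": "json",
    }
    return f"{base_url}/select?{urlencode(params)}"


def top_terms(facet_list):
    """Turn a flat [term, count, term, count, ...] list into (term, count) pairs.

    This assumes the flat list layout that Solr's JSON writer uses by default
    for facet_fields.
    """
    return list(zip(facet_list[0::2], facet_list[1::2]))


if __name__ == "__main__":
    # Hypothetical host and field name.
    print(facet_query_url("http://localhost:8983/solr", "text"))
    print(top_terms(["tennis", 12, "swimming", 9, "basketball", 7]))
```

The counts are document counts, so a word used fifty times in one document still counts once. Without stop-word filtering in the field's analyzer, the top terms will mostly be words like "the" and "and".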
These are simple queries. There are also batch computations where you
create a 'term-document matrix', with a row for each document and a
column for all terms that appear in any document. These computations
require exporting all of your data into a separate batch job.
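
To make the 'term-document matrix' idea concrete, here is a small sketch in plain Python that runs outside Solr on text that has already been exported. The tokenizer is deliberately naive, and all function names are invented for this example.

```python
import re
from collections import Counter


def tokenize(text):
    """Naive tokenizer: lowercase, keep runs of ASCII letters only."""
    return re.findall(r"[a-z]+", text.lower())


def term_document_matrix(docs):
    """Return (vocab, matrix) with one row per document, one column per term.

    matrix[i][j] is how many times vocab[j] appears in docs[i].
    """
    counts = [Counter(tokenize(d)) for d in docs]
    vocab = sorted(set().union(*counts))
    matrix = [[c[t] for t in vocab] for c in counts]
    return vocab, matrix


def document_frequency(vocab, matrix):
    """For each term, count the documents that contain it at least once."""
    return {t: sum(1 for row in matrix if row[j] > 0)
            for j, t in enumerate(vocab)}


if __name__ == "__main__":
    docs = ["Tennis, tennis and swimming", "Swimming or basketball"]
    vocab, matrix = term_document_matrix(docs)
    print(vocab)
    print(matrix)
    print(document_frequency(vocab, matrix))
```

Each row of the matrix holds the same per-document counts a term vector would give, and the document-frequency totals correspond to what a facet on the field reports.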



On Thu, Aug 2, 2012 at 1:26 PM, Chris Dawson <xr...@gmail.com> wrote:
> Tor,
>
> Thanks for your response.
>
> I'd like to put an arbitrary set of text into Solr and then have Solr tell
> me the ten most popular "topics" that are in there.  For example, if I put
> in 100 paragraphs of text about sports, I would like to retrieve topics
> like "swimming, basketball, tennis" if those are the three most discussed
> topics in the text.
>
> Is Solr the correct tool to do something like this?  Or, is this too
> unstructured to get this kind of result without manually categorizing it?
>
> Is the correct term for this faceting?  It seems to me that faceting
> requires putting the data into a more structured format (for example,
> telling the index that this is the "manufacturer", etc.)
>
> Basically, I would like to get something like a tag cloud (relevant topics
> with weights for each term) without asking users to tag things manually.
>
> Chris



-- 
Lance Norskog
goksron@gmail.com

Re: Trending topics?

Posted by Chris Dawson <xr...@gmail.com>.
Tor,

Thanks for your response.

I'd like to put an arbitrary set of text into Solr and then have Solr tell
me the ten most popular "topics" that are in there.  For example, if I put
in 100 paragraphs of text about sports, I would like to retrieve topics
like "swimming, basketball, tennis" if those are the three most discussed
topics in the text.

Is Solr the correct tool to do something like this?  Or, is this too
unstructured to get this kind of result without manually categorizing it?

Is the correct term for this faceting?  It seems to me that faceting
requires putting the data into a more structured format (for example,
telling the index that this is the "manufacturer", etc.)

Basically, I would like to get something like a tag cloud (relevant topics
with weights for each term) without asking users to tag things manually.

Chris

On Thu, Aug 2, 2012 at 3:25 PM, Tor Henning Ueland <to...@gmail.com> wrote:

> On Thu, Aug 2, 2012 at 5:34 PM, Chris Dawson <xr...@gmail.com> wrote:
>
> > How would I generate a list of trending topics using solr?
> >
>
>
> By putting them in solr.
> (A generic question gets a generic answer.)
>
> What do you mean? Trending searches, trending data, trending documents,
> trending what?
>
>
> --
> Regards
> Tor Henning Ueland
>



-- 
Chris Dawson
971-533-8335
Human potential, travel and entrepreneurship:  http://webiphany.com/
Traveling to Portland, OR?  http://www.airbnb.com/rooms/58909

Re: Trending topics?

Posted by Tor Henning Ueland <to...@gmail.com>.
On Thu, Aug 2, 2012 at 5:34 PM, Chris Dawson <xr...@gmail.com> wrote:

> How would I generate a list of trending topics using solr?
>


By putting them in solr.
(A generic question gets a generic answer.)

What do you mean? Trending searches, trending data, trending documents,
trending what?


-- 
Regards
Tor Henning Ueland