Posted to solr-user@lucene.apache.org by "Thung, Peter C CIV SPAWARSYSCEN-PACIFIC, 56340" <pe...@navy.mil> on 2009/09/28 22:41:02 UTC

Question on Access or viewing TermFrequency Vector via SOLR.

Is there a SOLR query that can access or view the TermFrequencies for
the various documents discovered, or is the only way to access this
information programmatically?
If so, could someone share an example and maybe a link for information on
how to do this?
Some sample queries?
 
Thank you in advance.
 

-Peter

 


RE: Question on Access or viewing TermFrequency Vector via SOLR.

Posted by "Thung, Peter C CIV SPAWARSYSCEN-PACIFIC, 56340" <pe...@navy.mil>.
Grant,

Thanks for the link. Based on the example, I think this is what I need.
If efficiency becomes a problem, I will consider writing my own
TermVectorMapper. I see the note that tv.df can be expensive.
I guess it all depends on how big the collection is.

I'm a proponent of not reinventing the wheel if it has already been
invented and can be easily integrated into my task.

I looked at the TermVectorComponentExampleEnabled (Example output) and
it looks like it is what I needed.

-Peter


 


> -----Original Message-----
> From: Grant Ingersoll [mailto:gsingers@apache.org] 
> Sent: Monday, September 28, 2009 6:17 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Question on Access or viewing TermFrequency 
> Vector via SOLR.
> 
> 
> http://wiki.apache.org/solr/TermVectorComponent.  You may 
> want to hack  
> in your own capabilities to implement your own TermVectorMapper for  
> efficiency reasons.
> 

Re: Question on Access or viewing TermFrequency Vector via SOLR.

Posted by Grant Ingersoll <gs...@apache.org>.
http://wiki.apache.org/solr/TermVectorComponent.  You may want to hack  
in your own capabilities to implement your own TermVectorMapper for  
efficiency reasons.
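
For readers following the link, here is a minimal Python sketch of what
a TermVectorComponent request might look like. It only builds the URL.
The base URL, the "tvrh" handler name (taken from the example
configuration described on the wiki), and the helper name
term_vector_url are assumptions for illustration; parameter
availability can differ between Solr versions, and the field must be
indexed with termVectors="true" in the schema.

```python
from urllib.parse import urlencode


def term_vector_url(base_url, query, field, include_df=False):
    """Build a select URL that asks for per-document term frequencies.

    Hypothetical helper: the handler name "tvrh" is an assumption based
    on the example solrconfig.xml and may differ in your setup.
    """
    params = {
        "q": query,        # ordinary Solr query, e.g. name:apple
        "qt": "tvrh",      # assumed request handler with the component enabled
        "fl": "id",        # stored fields to return for each hit
        "tv": "true",      # turn the TermVectorComponent on
        "tv.tf": "true",   # term frequency of each term in each document
        "tv.fl": field,    # field(s) to return term vectors for
        "wt": "json",
    }
    if include_df:
        # Document frequency is noted on the wiki as potentially expensive.
        params["tv.df"] = "true"
    return base_url.rstrip("/") + "/select?" + urlencode(params)


print(term_vector_url("http://localhost:8983/solr", "name:apple", "name"))
```

The response lists, for each matching document, the terms in the
requested field with their frequencies, which is the per-document count
asked about in this thread.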

On Sep 28, 2009, at 5:05 PM, Thung, Peter C CIV SPAWARSYSCEN-PACIFIC,  
56340 wrote:

> Mark,
>
> Thanks.  I think this may be partially what I need.
>
> Basically, what I'm trying to figure out is the following: if someone
> enters a keyword, say "apple", I would like to find all the documents
> that contain the word and, for each document, the number of times it
> appears in that document.
>
> From the link you sent (assuming I understand it correctly), for the
> field "name" it lists the terms (values) within that field, such as
> 1, 11, 120, 133, 184, etc., with the respective counts of how many
> documents match each term. (I have to wonder whether it counts a
> document multiple times if the term appears in it more than once.)
>
> It does not tell me which document matched a specific term, or the
> number of terms that are in a specific document, correct?
>
>
> -Peter

--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)  
using Solr/Lucene:
http://www.lucidimagination.com/search


RE: Question on Access or viewing TermFrequency Vector via SOLR.

Posted by "Thung, Peter C CIV SPAWARSYSCEN-PACIFIC, 56340" <pe...@navy.mil>.
Mark,

Thanks.  I think this may be partially what I need.

Basically, what I'm trying to figure out is the following: if someone
enters a keyword, say "apple", I would like to find all the documents
that contain the word and, for each document, the number of times it
appears in that document.

From the link you sent (assuming I understand it correctly), for the
field "name" it lists the terms (values) within that field, such as
1, 11, 120, 133, 184, etc., with the respective counts of how many
documents match each term. (I have to wonder whether it counts a
document multiple times if the term appears in it more than once.)

It does not tell me which document matched a specific term, or the
number of terms that are in a specific document, correct?
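
The distinction being asked about is between document frequency (how
many documents contain a term) and term frequency (how often the term
occurs inside one document). A toy Python sketch of the difference
follows. The three sample documents, the whitespace tokenization, and
the helper name term_frequencies are made up for illustration and do
not reflect how Solr analyzes text.

```python
from collections import Counter

# Hypothetical mini-collection; not data from the thread.
docs = {
    "doc1": "apple pie with apple sauce",
    "doc2": "banana bread",
    "doc3": "an apple a day",
}


def term_frequencies(docs, term):
    """Return {doc_id: count} for every document containing the term."""
    wanted = term.lower()
    result = {}
    for doc_id, text in docs.items():
        # Naive lowercase/whitespace tokenization, a stand-in for an analyzer.
        count = Counter(text.lower().split())[wanted]
        if count:
            result[doc_id] = count
    return result


tf = term_frequencies(docs, "Apple")
df = len(tf)  # each matching document is counted once

print(tf)  # {'doc1': 2, 'doc3': 1}
print(df)  # 2
```

Under this reading, a per-term document count of the kind TermsComponent
reports corresponds to df, so a document is counted once no matter how
often the term occurs in it, while the per-document counts in tf are
what the question above is really after.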


-Peter



******************************************************************
Peter Thung
Software Developer
IBS Project Technical Lead -Web Developer
 
Code 56340  - Net-centric ISR Development Branch
Joint & National ISR Systems Division
Intelligence, Surveillance and Reconnaissance Department
US Navy Space & Naval Warfare Systems Center Pacific (SSC PAC)
Topside Campus, Bldg A33, room 0055
53560 Hull Street, San Diego, CA 92152
 
UNCLASS Email: peter.thung@navy.mil
SIPRNET Email: thungp@spawar.navy.smil.mil
COMM (Primary): (619) 553-6513
COMM (Secondary):(619) 553-0777
FAX: (619) 553-1586
******************************************************************
 


> -----Original Message-----
> From: Mark Miller [mailto:markrmiller@gmail.com] 
> Sent: Monday, September 28, 2009 1:50 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Question on Access or viewing TermFrequency 
> Vector via SOLR.
> 
> 
> Thung, Peter C CIV SPAWARSYSCEN-PACIFIC, 56340 wrote:
> > Is there a SOLR query that can access or view the TermFrequencies
> > for the various documents discovered, or is the only way to access
> > this information programmatically?
> > If so, could someone share an example and maybe a link for
> > information on how to do this?
> > Some sample queries?
> >
> > Thank you in advance.
> >
> > -Peter
> >
> Closest I can think of is: http://wiki.apache.org/solr/TermsComponent
> 
> -- 
> - Mark
> 
> http://www.lucidimagination.com
> 
> 
> 
> 

Re: Question on Access or viewing TermFrequency Vector via SOLR.

Posted by Mark Miller <ma...@gmail.com>.
Thung, Peter C CIV SPAWARSYSCEN-PACIFIC, 56340 wrote:
> Is there a SOLR query that can access or view the TermFrequencies for
> the various documents discovered, or is the only way to access this
> information programmatically?
> If so, could someone share an example and maybe a link for information on
> how to do this?
> Some sample queries?
>  
> Thank you in advance.
>  
>
> -Peter
>
>  
>
>
>   
Closest I can think of is: http://wiki.apache.org/solr/TermsComponent
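
As a rough illustration of the page linked above, a TermsComponent
request could be built as in the Python sketch below. The base URL, the
"/terms" handler path (as registered in the example configuration), and
the helper name terms_url are assumptions; check your own
solrconfig.xml before relying on them.

```python
from urllib.parse import urlencode


def terms_url(base_url, field, prefix=None, limit=10):
    """Build a TermsComponent URL listing terms in a field with doc counts.

    Hypothetical helper: assumes a request handler is mapped to /terms.
    """
    params = {
        "terms": "true",            # turn the TermsComponent on
        "terms.fl": field,          # field whose indexed terms are listed
        "terms.limit": str(limit),  # maximum number of terms to return
        "wt": "json",
    }
    if prefix:
        params["terms.prefix"] = prefix  # only terms starting with this prefix
    return base_url.rstrip("/") + "/terms?" + urlencode(params)


print(terms_url("http://localhost:8983/solr", "name", prefix="app"))
```

This returns index-wide terms and counts for the field, not a
per-document breakdown.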

-- 
- Mark

http://www.lucidimagination.com