Posted to solr-user@lucene.apache.org by Fergus McMenemie <fe...@twig.me.uk> on 2009/09/30 08:21:53 UTC
Number of terms in a SOLR field
Hi all,
I am attempting to test some changes I made to my DIH-based
indexing process. The changes only affect the way I
describe my fields in data-config.xml; there should be no
changes to the way the data is indexed or stored.
As a QA check I wanted to compare the results from
indexing the same data before and after the change. I was looking
for a way of getting counts of terms in each field. I
guess Luke etc. must allow this, but how?
Regards Fergus.
Re: Number of terms in a SOLR field
Posted by Fergus McMenemie <fe...@twig.me.uk>.
>Fergus McMenemie wrote:
>>> Fergus McMenemie wrote:
>>>> Hi all,
>>>>
>>>> I am attempting to test some changes I made to my DIH based
>>>> indexing process. The changes only affect the way I
>>>> describe my fields in data-config.xml, there should be no
>>>> changes to the way the data is indexed or stored.
>>>>
>>>> As a QA check I was wanting to compare the results from
>>>> indexing the same data before/after the change. I was looking
>>>> for a way of getting counts of terms in each field. I
>>>> guess Luke etc must allow this but how?
>>> Luke uses brute force approach - it traverses all terms, and counts
>>> terms per field. This is easy to implement yourself - just get
>>> IndexReader.terms() enumeration and traverse it.
>>>
>> Thanks Andrzej
>>
>> This is just a one off QA check. How do I get Luke to display
>> terms and counts?
>
>1. get Luke 0.9.9
>2. open index with Luke
>3. Look at the Overview panel, you will see the list titled "Available
>fields and term counts per field".
>
>
Thanks,
That got me going, and I felt a little stupid after stumbling
across http://wiki.apache.org/solr/LukeRequestHandler
Regards Fergus
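For the before/after QA check described in this thread, the LukeRequestHandler output can also be compared by script. The following is only a minimal sketch, not a tested recipe: the base URL is hypothetical, and the `/admin/luke` path, the `fields` map and the per-field `distinct` key are assumptions about the handler's JSON output that should be checked against your own Solr version.

```python
import json
from urllib.request import urlopen


def fetch_luke(base_url):
    """Fetch the LukeRequestHandler report as a dict.

    base_url is a hypothetical Solr URL such as http://localhost:8983/solr;
    the /admin/luke path is an assumption about how the handler is mapped.
    """
    with urlopen(base_url + "/admin/luke?wt=json") as resp:
        return json.load(resp)


def distinct_terms_by_field(luke):
    """Map field name -> distinct term count.

    Assumes each entry under "fields" may carry a "distinct" key;
    fields without one (e.g. stored-only fields) are skipped.
    """
    fields = luke.get("fields", {})
    return {name: info["distinct"]
            for name, info in fields.items()
            if "distinct" in info}


def diff_counts(before, after):
    """Return {field: (before, after)} for every field whose count differs.

    A field missing on one side is reported with None for that side.
    """
    changed = {}
    for name in sorted(set(before) | set(after)):
        b, a = before.get(name), after.get(name)
        if b != a:
            changed[name] = (b, a)
    return changed
```

One possible use: call `distinct_terms_by_field(fetch_luke(...))` once after indexing with the old data-config.xml and once with the new one, save both results, and pass them to `diff_counts`. An empty result means the per-field term counts match.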
Re: Number of terms in a SOLR field
Posted by Andrzej Bialecki <ab...@getopt.org>.
Fergus McMenemie wrote:
>> Fergus McMenemie wrote:
>>> Hi all,
>>>
>>> I am attempting to test some changes I made to my DIH based
>>> indexing process. The changes only affect the way I
>>> describe my fields in data-config.xml, there should be no
>>> changes to the way the data is indexed or stored.
>>>
>>> As a QA check I was wanting to compare the results from
>>> indexing the same data before/after the change. I was looking
>>> for a way of getting counts of terms in each field. I
>>> guess Luke etc must allow this but how?
>> Luke uses brute force approach - it traverses all terms, and counts
>> terms per field. This is easy to implement yourself - just get
>> IndexReader.terms() enumeration and traverse it.
>>
> Thanks Andrzej
>
> This is just a one off QA check. How do I get Luke to display
> terms and counts?
1. Get Luke 0.9.9.
2. Open the index with Luke.
3. Look at the Overview panel; you will see the list titled "Available
fields and term counts per field".
--
Best regards,
Andrzej Bialecki <><
___. ___ ___ ___ _ _ __________________________________
[__ || __|__/|__||\/| Information Retrieval, Semantic Web
___|||__|| \| || | Embedded Unix, System Integration
http://www.sigram.com Contact: info at sigram dot com
Re: Number of terms in a SOLR field
Posted by Fergus McMenemie <fe...@twig.me.uk>.
>Fergus McMenemie wrote:
>> Hi all,
>>
>> I am attempting to test some changes I made to my DIH based
>> indexing process. The changes only affect the way I
>> describe my fields in data-config.xml, there should be no
>> changes to the way the data is indexed or stored.
>>
>> As a QA check I was wanting to compare the results from
>> indexing the same data before/after the change. I was looking
>> for a way of getting counts of terms in each field. I
>> guess Luke etc must allow this but how?
>
>Luke uses brute force approach - it traverses all terms, and counts
>terms per field. This is easy to implement yourself - just get
>IndexReader.terms() enumeration and traverse it.
>
Thanks Andrzej
This is just a one-off QA check. How do I get Luke to display
terms and counts?
>
>--
>Best regards,
>Andrzej Bialecki
Fergus.
Re: Number of terms in a SOLR field
Posted by Andrzej Bialecki <ab...@getopt.org>.
Fergus McMenemie wrote:
> Hi all,
>
> I am attempting to test some changes I made to my DIH based
> indexing process. The changes only affect the way I
> describe my fields in data-config.xml, there should be no
> changes to the way the data is indexed or stored.
>
> As a QA check I was wanting to compare the results from
> indexing the same data before/after the change. I was looking
> for a way of getting counts of terms in each field. I
> guess Luke etc must allow this but how?
Luke uses a brute-force approach - it traverses all terms and counts
terms per field. This is easy to implement yourself - just get the
IndexReader.terms() enumeration and traverse it.
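
The counting step of that traversal can be sketched independently of Lucene. This is an illustration in Python rather than the Java you would write against IndexReader.terms(): the enumeration is stood in for by a plain iterable of (field, text) pairs, and the function name is hypothetical. It relies on the enumeration yielding each distinct term once, so counting entries per field gives the number of distinct terms in that field.

```python
from collections import Counter


def count_terms_per_field(terms):
    """Count distinct terms per field.

    `terms` is any iterable of (field, text) pairs, standing in for the
    term enumeration; each distinct term is assumed to appear once.
    """
    counts = Counter()
    for field, _text in terms:
        counts[field] += 1
    return dict(counts)


# Hand-made stand-in for a term enumeration, ordered by field then text.
sample = [("body", "index"), ("body", "solr"), ("title", "solr")]
print(count_terms_per_field(sample))  # {'body': 2, 'title': 1}
```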
--
Best regards,
Andrzej Bialecki <><