Posted to solr-user@lucene.apache.org by Alexandre Rafalovitch <ar...@gmail.com> on 2013/04/14 06:29:30 UTC
Is there any way to return the number of indexed tokens in a field?
Hello,
We seem to have all sorts of functions around tokenized field content, but
I am looking for a simple count/length that can be returned as a
pseudo-field. Does anyone know of one out of the box?
The specific situation is that I am indexing a field for specific regular
expressions that become tokens (in a copyField). Not every document has the
same number of those.
I now want to find the documents that have the maximum number of tokens in that
field (for testing and review), but I can't figure out how. Any help would
be appreciated.
Regards,
Alex.
Personal blog: http://blog.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all at
once. Lately, it doesn't seem to be working. (Anonymous - via GTD book)
Re: Is there any way to return the number of indexed tokens in a field?
Posted by Mikhail Khludnev <mk...@griddynamics.com>.
Alex,
It's not clear what you need to count: the pre-analyzed values or the tokens
produced by analysis.
If the former, I suggest you look into something like
FieldLengthUpdateProcessorFactory; in the case of the latter, you need to override
Similarity.computeNorm(String, FieldInvertState) / encode/decodeNorm.
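For the "pre-analyzed" route, a rough sketch of the counting itself may help. It uses only the JDK and assumes that each regular-expression match becomes exactly one token in the copyField, which may not hold for your analyzer chain. The class name, method names, pattern and sample documents are all made up for illustration; none of this is a Solr or Lucene API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of Solr or Lucene.
class TokenCountSketch {

    // Counts the non-overlapping matches of the pattern in the text.
    // Assumption: one match corresponds to one indexed token.
    static int countTokens(Pattern pattern, String text) {
        Matcher matcher = pattern.matcher(text);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    // Returns the ids of the documents whose text yields the most matches.
    // Ties are all returned, in the iteration order of the map.
    static List<String> idsWithMostTokens(Pattern pattern, Map<String, String> textById) {
        List<String> best = new ArrayList<>();
        int max = -1;
        for (Map.Entry<String, String> entry : textById.entrySet()) {
            int count = countTokens(pattern, entry.getValue());
            if (count > max) {
                // New maximum found: forget the previous leaders.
                max = count;
                best.clear();
            }
            if (count == max) {
                best.add(entry.getKey());
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Illustrative pattern and documents, not taken from the original post.
        Pattern pattern = Pattern.compile("[A-Z]{2}-\\d+");
        Map<String, String> docs = new LinkedHashMap<>();
        docs.put("doc1", "see AB-12 and CD-345");
        docs.put("doc2", "nothing to match here");
        docs.put("doc3", "EF-1, GH-2, IJ-3");

        for (Map.Entry<String, String> entry : docs.entrySet()) {
            System.out.println(entry.getKey() + ": " + countTokens(pattern, entry.getValue()));
        }
        System.out.println("most tokens: " + idsWithMostTokens(pattern, docs));
    }
}
```

If a count like this were computed at index time and stored in its own integer field, sorting on that field should surface the documents with the most matches. That is a design suggestion rather than something Solr does out of the box.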
--
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics
<http://www.griddynamics.com>
<mk...@griddynamics.com>