Posted to oak-issues@jackrabbit.apache.org by "Thomas Mueller (Jira)" <ji...@apache.org> on 2022/05/24 15:37:00 UTC

[jira] [Commented] (OAK-9781) Lucene Index MBean getFieldTerms Excludes Results for Unique Fields

    [ https://issues.apache.org/jira/browse/OAK-9781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17541580#comment-17541580 ] 

Thomas Mueller commented on OAK-9781:
-------------------------------------

The code says "> 1" 
https://github.com/apache/jackrabbit-oak/blame/bd4b690561fb6456ed9f42beefd47f93919917f2/oak-lucene/src/main/java/org/apache/jackrabbit/oak/plugins/index/lucene/LuceneIndexMBeanImpl.java#L506

I don't remember the reason for having this condition. Maybe it was added because, without it, the result could easily contain too many entries? If we change it, adding a parameter (a minimum count) might be good, though that might mean we have to add one more method. But I'm not sure; we would need to test whether that is really necessary.
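
To illustrate the "min count" idea, here is a minimal, standalone sketch. It is not the code in LuceneIndexMBeanImpl: the class FieldTermFilter, the method filterTerms, and the parameters minCount and max are hypothetical names, and a plain map of term to document count stands in for the Lucene index. With minCount = 2 it behaves like the current "> 1" condition; with minCount = 1 it also returns terms that occur in a single document, which is what the issue asks for.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Oak or Lucene.
class FieldTermFilter {

    /**
     * Returns at most max terms whose document count is at least minCount,
     * ordered by count (highest first) and then by term.
     */
    static Map<String, Integer> filterTerms(Map<String, Integer> docCounts, int minCount, int max) {
        List<Map.Entry<String, Integer>> kept = new ArrayList<>();
        for (Map.Entry<String, Integer> e : docCounts.entrySet()) {
            // The current condition corresponds to minCount = 2 (count > 1).
            if (e.getValue() >= minCount) {
                kept.add(e);
            }
        }
        kept.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        Map<String, Integer> result = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : kept) {
            if (result.size() >= max) {
                break; // cap the result so unique fields cannot flood the output
            }
            result.put(e.getKey(), e.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample data: one shared term and two unique terms.
        Map<String, Integer> counts = new HashMap<>();
        counts.put("alpha", 3);
        counts.put("beta", 1);
        counts.put("gamma", 1);

        System.out.println(filterTerms(counts, 2, 10)); // prints {alpha=3}
        System.out.println(filterTerms(counts, 1, 10)); // prints {alpha=3, beta=1, gamma=1}
        System.out.println(filterTerms(counts, 1, 2));  // prints {alpha=3, beta=1}
    }
}
```

The max cap in this sketch is one possible way to address the "too many entries" concern when the minimum count is lowered; whether it is needed at all would still have to be tested.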


> Lucene Index MBean getFieldTerms Excludes Results for Unique Fields
> -------------------------------------------------------------------
>
>                 Key: OAK-9781
>                 URL: https://issues.apache.org/jira/browse/OAK-9781
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: indexing
>    Affects Versions: 1.8.0
>            Reporter: Dan Klco
>            Priority: Minor
>             Fix For: 1.44.0
>
>
> The getFieldTerms method in the Lucene Index MBean only includes terms with > 1 documents. This means that for fields with unique or very well distributed values, such as UUIDs, paths or even file sizes, this method will return few or no results.
> Instead, this should only exclude terms where there are no associated documents.


