Posted to oak-issues@jackrabbit.apache.org by "Thomas Mueller (Jira)" <ji...@apache.org> on 2022/05/24 15:37:00 UTC
[jira] [Commented] (OAK-9781) Lucene Index MBean getFieldTerms Excludes Results for Unique Fields
[ https://issues.apache.org/jira/browse/OAK-9781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17541580#comment-17541580 ]
Thomas Mueller commented on OAK-9781:
-------------------------------------
The code says "> 1"
https://github.com/apache/jackrabbit-oak/blame/bd4b690561fb6456ed9f42beefd47f93919917f2/oak-lucene/src/main/java/org/apache/jackrabbit/oak/plugins/index/lucene/LuceneIndexMBeanImpl.java#L506
I don't remember the reason for this condition. Maybe it was added because, without it, too many entries were returned too easily? If we change it, adding a parameter (a minimum count) might be good, which could mean adding one more method. But I'm not sure; we would need to test whether that is really necessary.
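To make the "min count" idea concrete, here is a minimal sketch in plain Java. It is not the Oak code: the class FieldTermsSketch, the method filterByMinCount, and the sample terms are all hypothetical, and a plain map of term to document count stands in for the Lucene index. A minimum of 2 corresponds to the "> 1" condition, and a minimum of 1 drops only terms with no documents.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: a hypothetical stand-in for the filtering step, not the Oak MBean API.
public class FieldTermsSketch {

    // Keeps the terms whose document count is at least minCount, preserving input order.
    public static Map<String, Integer> filterByMinCount(Map<String, Integer> docCounts, int minCount) {
        Map<String, Integer> result = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> entry : docCounts.entrySet()) {
            if (entry.getValue() >= minCount) {
                result.put(entry.getKey(), entry.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample data: term -> number of documents containing it.
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("some-unique-uuid", 1);
        counts.put("image/png", 7);
        counts.put("no-longer-used", 0);

        // minCount = 2 is equivalent to the "> 1" condition: unique terms are dropped.
        System.out.println(filterByMinCount(counts, 2));

        // minCount = 1 only drops terms with no associated documents.
        System.out.println(filterByMinCount(counts, 1));
    }
}
```

With a parameter like this, the default could stay at 2 to keep the result small, while callers interested in unique values pass 1. Whether the larger result is a problem in practice would still need to be tested.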
> Lucene Index MBean getFieldTerms Excludes Results for Unique Fields
> -------------------------------------------------------------------
>
> Key: OAK-9781
> URL: https://issues.apache.org/jira/browse/OAK-9781
> Project: Jackrabbit Oak
> Issue Type: Bug
> Components: indexing
> Affects Versions: 1.8.0
> Reporter: Dan Klco
> Priority: Minor
> Fix For: 1.44.0
>
>
> The getFieldTerms method in the Lucene Index MBean only includes terms that occur in more than one document. This means that fields with unique or very well-distributed values, such as UUIDs, paths, or even file sizes, return few or no results from this method.
> Instead, the method should only exclude terms that have no associated documents.