Posted to dev@lucene.apache.org by "Shawn Heisey (JIRA)" <ji...@apache.org> on 2017/01/20 20:59:26 UTC

[jira] [Updated] (SOLR-10014) Log a warning when the number of fields in a core exceeds a configurable value

     [ https://issues.apache.org/jira/browse/SOLR-10014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shawn Heisey updated SOLR-10014:
--------------------------------
    Attachment: SOLR-10014.patch

Patch with an idea for how to implement the warning.  The initIndex method has a "firstTime" boolean, but I don't think that method has access to the objects needed to get the field count, so for now I'm not attempting to suppress the warning on reload.  The configuration option for solrconfig.xml hasn't been worked out either, so the threshold isn't configurable yet.  I'm pretty sure that I'm using the searcher object incorrectly, but I'm not sure how to do it correctly.
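
For anyone following along without the patch open, here is a rough sketch of the check being discussed, written as plain Java with no Solr dependencies.  It is not the code in SOLR-10014.patch.  The class name FieldCountWarner, the check method, and the warnOnReload flag are all hypothetical names made up for illustration; the only value taken from the issue is the suggested default of about 10000.  How to get the field count from the searcher is deliberately left out, since that part is still unresolved.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.logging.Logger;

/** Hypothetical helper (not part of Solr): warns when a core's field count passes a threshold. */
final class FieldCountWarner {
    private static final Logger LOG = Logger.getLogger(FieldCountWarner.class.getName());

    /** Default suggested in the issue description ("somewhere around 10000"). */
    static final int DEFAULT_THRESHOLD = 10_000;

    private final int threshold;
    private final boolean warnOnReload;
    // Remembers whether this instance has already logged the warning once.
    private final AtomicBoolean warned = new AtomicBoolean(false);

    FieldCountWarner(int threshold, boolean warnOnReload) {
        this.threshold = threshold;
        this.warnOnReload = warnOnReload;
    }

    /**
     * Logs a warning if fieldCount exceeds the threshold.
     * When warnOnReload is false, only the first over-threshold call logs.
     *
     * @return true if a warning was logged by this call
     */
    boolean check(String coreName, long fieldCount) {
        if (fieldCount <= threshold) {
            return false;
        }
        // compareAndSet succeeds only on the first over-threshold call.
        if (!warnOnReload && !warned.compareAndSet(false, true)) {
            return false;
        }
        LOG.warning(String.format(
            "Core %s has %d fields, which exceeds the threshold of %d",
            coreName, fieldCount, threshold));
        return true;
    }
}
```

The caller would pass in whatever field count it manages to obtain at core creation.  Suppressing the repeat only works if the same FieldCountWarner instance is still around after a reload, so where that state would live in Solr is an open question this sketch does not answer.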

> Log a warning when the number of fields in a core exceeds a configurable value
> ------------------------------------------------------------------------------
>
>                 Key: SOLR-10014
>                 URL: https://issues.apache.org/jira/browse/SOLR-10014
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>    Affects Versions: 4.10.4
>            Reporter: Shawn Heisey
>            Priority: Minor
>         Attachments: SOLR-10014.patch
>
>
> When the number of fields in an index gets extremely large, major performance problems can occur.  If the number of fields in a core exceeds a configurable number, with a default somewhere around 10000, a warning should be logged when the SolrCore is first created.  A decision needs to be made about whether to repeat the warning on core reload ... my instinct is that it should NOT be repeated, but I can see where a repeat might have some value.  Logging on reloads as well as startup would likely be easier.
> This was discovered by a Solr user who had a 420MB index with 650K documents, but their applications were abusing dynamic fields to the point where they had about 2 million unique fields in the index.  The small size of the index *should* have resulted in extremely fast commit times, but commits were taking about 10 seconds because of what Lucene had to do to handle all those fields.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org