Posted to dev@lucene.apache.org by "Erick Erickson (Commented) (JIRA)" <ji...@apache.org> on 2011/12/28 21:22:32 UTC

[jira] [Commented] (SOLR-1931) Schema Browser does not scale with large indexes

    [ https://issues.apache.org/jira/browse/SOLR-1931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13176796#comment-13176796 ] 

Erick Erickson commented on SOLR-1931:
--------------------------------------

For the trunk (4.x) version, see the note from Muir quoted below. I haven't looked at this yet, but being able to get even an approximation back from Luke quickly would be a big help. Maybe we can make this happen on trunk?

The use-case I'm interested in is the one in which we're really only looking for outrageous numbers of unique terms. Having unique terms per segment would go a long way towards that use-case.

*******
Is it really necessary to see the 'top level' number of distinct terms
summed across all segments?
Maybe it's good enough to list the information on a per-segment basis.
Then it would always be instant-fast:

You would just use the FieldsEnum API to list all the fields, and for each
field call .terms() and then Terms.getUniqueTermCount().

Note: getUniqueTermCount won't work (returns -1) for any 3.x segments
that haven't yet been upgraded to the 4.0 format.
The old 3.x format only allows you to get the uniqueTermCount across
all fields in the segment (Fields.getUniqueTermCount), because fields
are not clearly separated.
******** 
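
To make the trade-off concrete, here is a minimal sketch of the counting argument only. It does not use the Lucene API: the class SegmentTermCounts, its methods, and the map-of-sets model of a segment are all hypothetical stand-ins. It assumes each segment already knows its distinct terms per field, so the per-segment number is a lookup, while the top-level number has to merge terms across every segment.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical model, NOT the Lucene API: a "segment" is a map from
// field name to the set of distinct terms that field holds in that segment.
final class SegmentTermCounts {

    // Build a single-field segment for illustration.
    static Map<String, Set<String>> segment(String field, String... terms) {
        Set<String> distinct = new HashSet<>(Arrays.asList(terms));
        return Collections.singletonMap(field, distinct);
    }

    // Per-segment counts: one size lookup per field, no term enumeration.
    static Map<String, Integer> perSegmentCounts(Map<String, Set<String>> segment) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, Set<String>> e : segment.entrySet()) {
            counts.put(e.getKey(), e.getValue().size());
        }
        return counts;
    }

    // Top-level count: every term of every segment must be visited and
    // de-duplicated, which is the expensive "scan the world" step.
    static int topLevelCount(List<Map<String, Set<String>>> segments, String field) {
        Set<String> merged = new HashSet<>();
        for (Map<String, Set<String>> seg : segments) {
            merged.addAll(seg.getOrDefault(field, Collections.<String>emptySet()));
        }
        return merged.size();
    }

    // Largest per-segment count: a cheap lower bound on the top-level count.
    static int maxPerSegment(List<Map<String, Set<String>>> segments, String field) {
        int max = 0;
        for (Map<String, Set<String>> seg : segments) {
            max = Math.max(max, seg.getOrDefault(field, Collections.<String>emptySet()).size());
        }
        return max;
    }

    // Sum of per-segment counts: a cheap upper bound, since terms shared
    // between segments are counted once per segment.
    static int sumPerSegment(List<Map<String, Set<String>>> segments, String field) {
        int sum = 0;
        for (Map<String, Set<String>> seg : segments) {
            sum += seg.getOrDefault(field, Collections.<String>emptySet()).size();
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Map<String, Set<String>>> segments = Arrays.asList(
                segment("title", "a", "b", "c"),
                segment("title", "b", "c", "d", "e"));
        System.out.println("lower bound: " + maxPerSegment(segments, "title"));
        System.out.println("upper bound: " + sumPerSegment(segments, "title"));
        System.out.println("exact:       " + topLevelCount(segments, "title"));
    }
}
```

For spotting outrageous numbers of unique terms, the two cheap bounds are probably enough: if even the largest segment reports a huge count for a field, the exact top-level figure is at least that large.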
                
> Schema Browser does not scale with large indexes
> ------------------------------------------------
>
>                 Key: SOLR-1931
>                 URL: https://issues.apache.org/jira/browse/SOLR-1931
>             Project: Solr
>          Issue Type: Improvement
>          Components: web gui
>    Affects Versions: 1.4
>            Reporter: Lance Norskog
>            Priority: Minor
>
> The Schema Browser JSP by default causes the Luke handler to "scan the world". In large indexes this makes the UI useless.
> On an index with 64M documents and 8 GB of disk space, the Schema Browser took 6 minutes to open and hogged all disk I/O, making Solr useless.
