Posted to dev@lucene.apache.org by Erick Erickson <er...@gmail.com> on 2014/05/07 17:10:37 UTC

Reading many field definitions from the fnm file very slow

The message below is from the users' list. Certainly this is a
pathological case; 2.6M fields (auto-generated, I presume) is a bit out
of the norm ;).

Does this point to a need to pre-allocate space to the hashmap? Don't
quite know whether this stack trace just happens to catch the hashmap
being resized or whether we're really spending all the time in
resizing it....
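
To make the pre-allocation question concrete, here is a minimal,
stand-alone sketch of the idea. It is not the FieldInfos.Builder code;
the class name, the method names and the "field_N" naming are all made
up for illustration. It only shows that a java.util.HashMap given an
initial capacity derived from the expected entry count never has to
resize while it is being filled, and it lets you time both variants.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-alone sketch; NOT Lucene's FieldInfos.Builder.
class PresizedMapSketch {

    // Initial capacity large enough to hold `expected` entries without a
    // resize, assuming HashMap's default load factor of 0.75.
    static int capacityFor(int expected) {
        return (int) (expected / 0.75f) + 1;
    }

    // Builds a name -> number map for made-up field names, either with the
    // default initial capacity or pre-sized for fieldCount entries.
    static Map<String, Integer> buildNameToNumber(int fieldCount, boolean presize) {
        Map<String, Integer> byName;
        if (presize) {
            byName = new HashMap<String, Integer>(capacityFor(fieldCount));
        } else {
            byName = new HashMap<String, Integer>();
        }
        for (int i = 0; i < fieldCount; i++) {
            byName.put("field_" + i, i);
        }
        return byName;
    }

    public static void main(String[] args) {
        int n = 2600000; // field count reported in the quoted message
        for (boolean presize : new boolean[] {false, true}) {
            long start = System.nanoTime();
            Map<String, Integer> m = buildNameToNumber(n, presize);
            long ms = (System.nanoTime() - start) / 1000000L;
            System.out.println("presize=" + presize + " size=" + m.size() + " ms=" + ms);
        }
    }
}
```

Whether this would help in the real code path depends on how much of
the 11 seconds is actually resizing, which the single stack trace
cannot tell us; timing a run like the one above (or taking several
stack dumps) would be a cheap first check.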

The original question was whether it was possible to prune the fnm
file when merging (OK, I've reworded it a bit). How valid is the
poster's point? They went from 2.6M fields to 2 fields and are still
getting 11-second faceting times on, I think, a 1-document index.

I don't know the guts of the merge code, but offhand I can see that
trying to purge the field descriptions on merge would be harder than it
might seem at first blush, and this is not a normal use case that we
need to support.

I thought I'd pass it along, but personally I think "nuke your index
and re-index" is a reasonable response.

FWIW...

**************************

For our setup, the file size is 123M. Internally it has 2.6M fields.

The problem is the facet operation. Faceting takes a while:
we are stuck in the call stack below for 11 seconds.

java.util.HashMap.transfer(Unknown Source)
java.util.HashMap.resize(Unknown Source)
java.util.HashMap.addEntry(Unknown Source)
java.util.HashMap.put(Unknown Source)
org.apache.lucene.index.FieldInfos$Builder.addOrUpdateInternal(FieldInfos.java:285)
org.apache.lucene.index.FieldInfos$Builder.add(FieldInfos.java:302)
org.apache.lucene.index.FieldInfos$Builder.add(FieldInfos.java:251)
org.apache.lucene.index.MultiFields.getMergedFieldInfos(MultiFields.java:276)
org.apache.lucene.index.SlowCompositeReaderWrapper.getFieldInfos(SlowCompositeReaderWrapper.java:220)
org.apache.lucene.search.FieldCacheImpl.getTermsIndex(FieldCacheImpl.java:1116)
org.apache.lucene.search.FieldCacheImpl.getTermsIndex(FieldCacheImpl.java:1106)
org.apache.solr.request.SimpleFacets.getFieldCacheCounts(SimpleFacets.java:574)
org.apache.solr.request.SimpleFacets.getTermCounts(SimpleFacets.java:429)
org.apache.solr.request.SimpleFacets.getFacetFieldCounts(SimpleFacets.java:517)
org.apache.solr.request.SimpleFacets.getFacetCounts(SimpleFacets.java:252)
org.apache.solr.handler.component.FacetComponent.process(FacetComponent.java:78)
