Posted to java-user@lucene.apache.org by "Krishnamurthy, Kannan" <Ka...@contractor.cengage.com> on 2013/08/26 22:45:36 UTC

Huge FacetArrays while using SortedSetDocValuesAccumulator

Hello, 

We are working with a large Lucene 4.3.0 index, using SortedSetDocValuesFacetFields to create facets and SortedSetDocValuesAccumulator for facet accumulation. We couldn't use a taxonomy-based facet implementation: we search through a MultiReader and our index is composed of multiple physical Lucene indices, so we cannot have a single taxonomy index. We have two million categories and expect another two million in the near future. As the current implementation of SortedSetDocValuesAccumulator does not support ReusingFacetArrays, we are concerned about potential garbage-collector-related performance issues in our high-traffic application. Will a future Lucene release support using ReusingFacetArrays in SortedSetDocValuesAccumulator?

Also, as an alternative, we are considering subclassing FacetIndexingParams to provide dimension-specific CategoryListParams at indexing time. This would help reduce the size of the FacetArrays per facet request. We realize this approach will not support multiple FacetRequests in a single SortedSetDocValuesAccumulator, because the SortedSetDocValuesReaderState constructor hardcodes the category to null when calling FacetIndexingParams.getCategoryListParams(null).

Are there better approaches to this problem?


Thanks in advance for any help. 

Kannan
Cengage Learning
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Huge FacetArrays while using SortedSetDocValuesAccumulator

Posted by Shai Erera <se...@gmail.com>.
Hi

SortedSetDocValuesAccumulator does receive FacetArrays in its ctor, so you
can pass ReusingFacetArrays. You will need to call FacetArrays.free() when
you're done with accumulation, though. Note, however, that
ReusingFacetArrays did not show any big gain even with large taxonomies;
that is, the overhead of allocating and freeing the arrays wasn't noticeable.
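
For readers unfamiliar with the reuse idea discussed above, here is a minimal
sketch of the general pattern: acquire a counting array from a pool, use it for
one request, then hand it back so the next request can reuse it. This is not
Lucene code. The class CountArrayPool and its methods acquire(), release() and
pooled() are hypothetical names invented for illustration, and the sketch does
not claim to mirror how ReusingFacetArrays is implemented.

```java
import java.util.ArrayDeque;
import java.util.Arrays;

/** Hypothetical pool of fixed-size int[] counting arrays; NOT part of Lucene. */
final class CountArrayPool {
    private final int arrayLength;   // one slot per category ordinal (assumption)
    private final int maxPooled;     // upper bound on idle arrays kept around
    private final ArrayDeque<int[]> free = new ArrayDeque<>();

    CountArrayPool(int arrayLength, int maxPooled) {
        this.arrayLength = arrayLength;
        this.maxPooled = maxPooled;
    }

    /** Returns a zeroed array, reusing an idle one when available. */
    synchronized int[] acquire() {
        int[] pooledArray = free.pollFirst();
        return pooledArray != null ? pooledArray : new int[arrayLength];
    }

    /** Clears the array and keeps it for reuse if the pool has room. */
    synchronized void release(int[] array) {
        if (array.length != arrayLength || free.size() >= maxPooled) {
            return; // wrong size or pool full: let the garbage collector take it
        }
        Arrays.fill(array, 0);
        free.addFirst(array);
    }

    /** Number of idle arrays currently held by the pool. */
    synchronized int pooled() {
        return free.size();
    }
}
```

A caller would wrap each request in acquire() and release(), ideally with
release() in a finally block, which plays the same role as calling
FacetArrays.free() once accumulation is done. Whether the pooling pays off
should be measured, given the observation above that allocation overhead was
not noticeable.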

If you expect to use very large taxonomies, then facet partitions can help.
But for that you need to use the sidecar taxonomy index.

Shai

