Posted to dev@lucene.apache.org by "Rob Audenaerde (JIRA)" <ji...@apache.org> on 2014/03/21 12:41:43 UTC
[jira] [Comment Edited] (LUCENE-5370) Sorting Facets on CategoryPath (Label)
[ https://issues.apache.org/jira/browse/LUCENE-5370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13942986#comment-13942986 ]
Rob Audenaerde edited comment on LUCENE-5370 at 3/21/14 11:40 AM:
------------------------------------------------------------------
I currently use the code below:
{code}
private FacetResult getTopValueChildren( int topN, final SortDir sortDir, String dim, String... path ) throws IOException
{
    if ( topN <= 0 )
    {
        throw new IllegalArgumentException( "topN must be > 0 (got: " + topN + ")" );
    }
    DimConfig dimConfig = this.verifyDim( dim );
    FacetLabel cp = new FacetLabel( dim, path );
    int dimOrd = this.taxoReader.getOrdinal( cp );
    if ( dimOrd == -1 )
    {
        return null;
    }

    // Bounded queue that keeps at most topN entries; the comparison is
    // inverted when the sort direction is not DESC.
    TopOrdAndLabelQueue q = new TopOrdAndLabelQueue( Math.min( this.taxoReader.getSize(), topN ) )
    {
        @Override
        protected boolean lessThan( OrdAndLabel a, OrdAndLabel b )
        {
            if ( sortDir == SortDir.DESC )
            {
                return super.lessThan( a, b );
            }
            else
            {
                return !super.lessThan( a, b );
            }
        }
    };

    // Walk the children of the requested path: first child, then its siblings.
    int ord = this.children[dimOrd];
    int totValue = 0;
    int childCount = 0;
    TopOrdAndLabelQueue.OrdAndLabel reuse = null;
    while ( ord != TaxonomyReader.INVALID_ORDINAL )
    {
        if ( this.values[ord] > 0 )
        {
            totValue += this.values[ord];
            childCount++;
            if ( reuse == null )
            {
                reuse = new TopOrdAndLabelQueue.OrdAndLabel();
            }
            reuse.ord = ord;
            reuse.value = this.values[ord];
            // The label lookup is the extra cost compared to sorting on counts.
            reuse.label = this.taxoReader.getPath( ord ).components[cp.length];
            reuse = q.insertWithOverflow( reuse );
        }
        ord = this.siblings[ord];
    }
    if ( totValue == 0 )
    {
        return null;
    }
    if ( dimConfig.multiValued )
    {
        if ( dimConfig.requireDimCount )
        {
            totValue = this.values[dimOrd];
        }
        else
        {
            // Our sum'd value is not correct, in general:
            totValue = -1;
        }
    }
    else
    {
        // Our sum'd dim value is accurate, so we keep it
    }

    // Pop from the queue and fill the result array from the back.
    LabelAndValue[] labelValues = new LabelAndValue[q.size()];
    for ( int i = labelValues.length - 1; i >= 0; i-- )
    {
        TopOrdAndLabelQueue.OrdAndLabel ordAndValue = q.pop();
        labelValues[i] = new LabelAndValue( ordAndValue.label, ordAndValue.value );
    }
    return new FacetResult( dim, path, totValue, labelValues, childCount );
}
{code}
I use the same approach as sorting on counts, except that I sort on the label instead. This adds some cost, because the labels have to be retrieved from the taxonomy reader.
So I ignore the counts for sorting, but I still return them, because the user is interested in the counts for the sorted facet labels.
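As a rough, Lucene-free illustration of the idea above (keep only the top N children in a bounded heap ordered by label, and carry the counts along without sorting on them), here is a minimal sketch. The names LabelSortSketch, Entry and topByLabel are hypothetical and exist only for this example; it uses java.util.PriorityQueue rather than Lucene's own queue and is not the code from the comment.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical stand-in for the facet code: no Lucene classes are used here.
class LabelSortSketch
{
    // One child of a facet dimension: its label and its count.
    static final class Entry
    {
        final String label;
        final int count;

        Entry( String label, int count )
        {
            this.label = label;
            this.count = count;
        }
    }

    // Returns up to topN entries with count > 0, ordered by label.
    static List<Entry> topByLabel( List<Entry> children, int topN, boolean ascending )
    {
        if ( topN <= 0 )
        {
            throw new IllegalArgumentException( "topN must be > 0 (got: " + topN + ")" );
        }
        Comparator<Entry> wanted = Comparator.comparing( ( Entry e ) -> e.label );
        if ( !ascending )
        {
            wanted = wanted.reversed();
        }
        // The heap head is the entry that sorts LAST in the wanted order,
        // so it is the one evicted when the heap grows beyond topN.
        int capacity = Math.max( 1, Math.min( topN, children.size() ) );
        PriorityQueue<Entry> heap = new PriorityQueue<>( capacity, wanted.reversed() );
        for ( Entry e : children )
        {
            if ( e.count <= 0 )
            {
                continue; // children without hits are skipped, as in the method above
            }
            heap.offer( e );
            if ( heap.size() > topN )
            {
                heap.poll();
            }
        }
        List<Entry> result = new ArrayList<>( heap );
        result.sort( wanted );
        return result;
    }
}
```

The counts never take part in the comparison; they are only carried through so they can be shown next to each label.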
Btw, I'm currently experimenting with a similar approach for facet labels that are effectively numbers (like currency). Because I do not know beforehand what will be in the facets, I put the String representation in the FacetLabel and store the numeric value in the float part of a FloatAssociationFacetField. Facets can then be sorted on the associated float value, which should be faster than retrieving labels from the reader.
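To illustrate why a numeric sort key helps for labels such as amounts, here is a small sketch that sorts entries on a float carried next to the display label. It assumes the float has already been read from the association; the names AssociatedValueSortSketch, Entry and sortByAssociatedValue are hypothetical, and no Lucene API is used.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch: the sort key is a float stored next to the label.
class AssociatedValueSortSketch
{
    static final class Entry
    {
        final String label;    // display text, e.g. "12.50"
        final float sortValue; // assumed to come from the float association
        final int count;

        Entry( String label, float sortValue, int count )
        {
            this.label = label;
            this.sortValue = sortValue;
            this.count = count;
        }
    }

    // Returns a new list ordered by the numeric value, not by the label text.
    static List<Entry> sortByAssociatedValue( List<Entry> entries, boolean ascending )
    {
        Comparator<Entry> byValue = ( a, b ) -> Float.compare( a.sortValue, b.sortValue );
        List<Entry> sorted = new ArrayList<>( entries );
        sorted.sort( ascending ? byValue : byValue.reversed() );
        return sorted;
    }
}
```

A plain String comparison would place "100.00" before "12.50" and "9.99"; comparing the stored float gives the numeric order and needs no label lookup during the sort.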
> Sorting Facets on CategoryPath (Label)
> --------------------------------------
>
> Key: LUCENE-5370
> URL: https://issues.apache.org/jira/browse/LUCENE-5370
> Project: Lucene - Core
> Issue Type: New Feature
> Components: modules/facet
> Affects Versions: 4.6
> Reporter: Rob Audenaerde
> Labels: features
>
> Facets support sorting through {{FacetRequest.SortOrder}}. This is used in {{ResultSortUtils}}. For my application it would be very nice if the facets could also be sorted on their label.
> I think this could be accomplished by extending {{FacetRequest}} with an extra enum {{SortType}}, and adding two extra {{Heap}} implementations to {{ResultSortUtils}} that compare the CategoryPath instead of the double value.
> What do you think of this idea? Or could the same behaviour be accomplished in a different way already?
> (Btw: I tried building this patch on the Lucene 5.0 trunk, but I couldn't get the Maven build to work. I will try again later on the 4.6 branch.)