Posted to solr-user@lucene.apache.org by André Widhani <an...@digicol.com> on 2019/10/22 14:11:59 UTC

Inaccuracies sorting by payload value

I have a problem when sorting by payload value: the resulting sort order
is correct for some documents, but not for all.

The field type and field definitions are as follows:

<fieldType name="delimited_payloads_int" stored="false" indexed="true"
class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.DelimitedPayloadTokenFilterFactory"
encoder="integer" delimiter="|"/>
  </analyzer>
</fieldType>

<field name="_utag_order" type="delimited_payloads_int" indexed="true"
stored="true" multiValued="true"/>

The request parameters are the following:

<lst name="params">
    <str name="q">_exact_utag_primary_id:utag77n5840c6h5v0g9b9ww</str>
    <str name="fl">_id,_utag_order,sort_val:${COMPUTED_SORT_VAL}</str>
    <str name="COMPUTED_SORT_VAL">payload(_utag_order,$COLL_ID)</str>
    <str name="sort">${COMPUTED_SORT_VAL} asc</str>
    <str name="rows">999</str>
    <str name="wt">xml</str>
    <str name="COLL_ID">utag77n5840c6h5v0g9b9ww</str>
</lst>
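
For anyone who wants to reproduce this, the same request can be built as a
query string. This is only a sketch in Python: the host, port and
collection name in BASE_URL are placeholders and not my real setup, while
the parameters are copied from the list above.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: host, port and collection name are placeholders.
BASE_URL = "http://localhost:8983/solr/mycollection/select"

# Request parameters copied from the <lst name="params"> block.
params = {
    "q": "_exact_utag_primary_id:utag77n5840c6h5v0g9b9ww",
    "fl": "_id,_utag_order,sort_val:${COMPUTED_SORT_VAL}",
    "COMPUTED_SORT_VAL": "payload(_utag_order,$COLL_ID)",
    "sort": "${COMPUTED_SORT_VAL} asc",
    "rows": "999",
    "wt": "xml",
    "COLL_ID": "utag77n5840c6h5v0g9b9ww",
}

# urlencode percent-escapes characters such as "$", "{", "}" and ":".
url = BASE_URL + "?" + urlencode(params)
print(url)
```

The ${COMPUTED_SORT_VAL} and $COLL_ID references are sent as-is and are
resolved by Solr on the server side, not by this script.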

And here is an excerpt of some of the documents in the result that are not
sorted correctly:

<doc>
    <str name="_id">doc77n53r5bag9e3075ikm</str>
    <arr name="_utag_order">
        <str>utag77n53dda1mf1cda749s1|1571733614</str>
        <str>utag77n5840c6h5v0g9b9ww|1571734246</str>
    </arr>
    <float name="sort_val">1.57173427E9</float>
</doc>
<doc>
    <str name="_id">doc77n52cnj78nmksuhikl</str>
    <arr name="_utag_order">
        <str>utag77n520zikegu6yjkfhn|1571733431</str>
        <str>utag77n5840c6h5v0g9b9ww|1571734248</str>
    </arr>
    <float name="sort_val">1.57173427E9</float>
</doc>
<doc>
    <str name="_id">doc77n52c05tevwo08hikk</str>
    <arr name="_utag_order">
        <str>utag77n520zikegu6yjkfhn|1571733428</str>
        <str>utag77n5840c6h5v0g9b9ww|1571734247</str>
    </arr>
    <float name="sort_val">1.57173427E9</float>
</doc>

Please check the payload value (the number after the "|" delimiter) in the
second item under "_utag_order": the order returned is 1571734246,
1571734248, 1571734247, which is clearly wrong for an ascending sort.

It looks like the payload function always returns a float, which results
in a loss of precision for the encoded integer payload values.
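
To check that suspicion outside of Solr, here is a minimal sketch in Python
(not Solr code). It assumes the payload ends up as a 32-bit IEEE 754 float,
which is what the <float> element in the response suggests; to_float32 is
just a helper name I made up for the illustration.

```python
import struct

def to_float32(value):
    """Round a number to the nearest IEEE 754 single-precision value."""
    return struct.unpack("f", struct.pack("f", float(value)))[0]

# Payloads for utag77n5840c6h5v0g9b9ww, in the order the documents came back.
payloads = [1571734246, 1571734248, 1571734247]
as_floats = [to_float32(p) for p in payloads]

print(as_floats)
# True if all three payloads collapse to the same float value.
print(len(set(as_floats)) == 1)
```

All three values come out as 1571734272.0, which matches the sort_val of
1.57173427E9 shown for each document. A 32-bit float has a 24-bit
significand, so between 2^30 and 2^31 it can only represent multiples of
128. If that is what happens, the three documents are ties from the sort's
point of view and their relative order is arbitrary.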

Is this a known issue? Is there a work-around to correctly sort by the
value that has been encoded?

Thanks for reading,
André