You are viewing a plain text version of this content. The canonical link for it is here.
Posted to solr-user@lucene.apache.org by Bharadwaj <ba...@gmail.com> on 2012/08/24 21:10:55 UTC

Sort string field like integer, type cast function

Hi,

Schema:

<field name="id" type="string" indexed="true" stored="true"
multiValued="false" required="true"/>
<field name="django_id" type="string" indexed="true" stored="true"
multiValued="false" />

...

<field name="game_count" type="sint" indexed="true" stored="true"
multiValued="false" />

<field name="address_state" type="text" indexed="true" stored="true"
multiValued="false" />


I am trying to sort all records based on django_id, a string field defined
in the schema.

Query URL:
http://localhost:8970/solr/core1/select?q=*%3A*&sort=django_id+desc&start=0&rows=10

Output:
This gives me the records with django_id sorted as a string. 999, 998
placed ahead of 1000, since it is alphabetically sorted desc.

Desired result:
I want to be able to sort by a string field, but treat it as an integer.
The field : django_id

I tried the following in the sort field and a few other functions:
1.  sum(django_id, 0) desc
2.  abs(django_id) desc

These functions break because I am trying to perform a numeric operation on
a string field.
I am looking for a type cast function. Is there a way to convert the field
to an integer before sorting?
ideally:
num(django_id) desc or int(django_id) desc

I am also willing to submit a patch if this is not already present and
feasible to implement.
If type casting fails, default to -1.

Alternate solution:
I can add a new integer field to the schema and store the same data as an
integer, but prefer to have a function, if possible.

Thanks,
Bharad

Re: Sort string field like integer, type cast function

Posted by Erick Erickson <er...@gmail.com>.
Why do you need the django_id as a string in the
first place? Strings are much less efficient than
numeric types in soooo many ways that it seems
you'd be better off making it a numeric type..

But the canonical way to do what you want is to
just left-pad the string version with zeros, then
the numeric and lexical orderings are the same.

However, given how sorting works, you'd wind
up using _less_ space (I think) having an
additional int field and sorting on that than if you
tried to use string for sorting and provided a cast
capability...

FWIW,
Erick



On Fri, Aug 24, 2012 at 3:10 PM, Bharadwaj <ba...@gmail.com> wrote:
> Hi,
>
> Schema:
>
> <field name="id" type="string" indexed="true" stored="true"
> multiValued="false" required="true"/>
> <field name="django_id" type="string" indexed="true" stored="true"
> multiValued="false" />
>
> ...
>
> <field name="game_count" type="sint" indexed="true" stored="true"
> multiValued="false" />
>
> <field name="address_state" type="text" indexed="true" stored="true"
> multiValued="false" />
>
>
> I am trying to sort all records based on django_id, a string field defined
> in the schema.
>
> Query URL:
> http://localhost:8970/solr/core1/select?q=*%3A*&sort=django_id+desc&start=0&rows=10
>
> Output:
> This gives me the records with django_id sorted as a string. 999, 998
> placed ahead of 1000, since it is alphabetically sorted desc.
>
> Desired result:
> I want to be able to sort by a string field, but treat it as an integer.
> The field : django_id
>
> I tried the following in the sort field and a few other functions:
> 1.  sum(django_id, 0) desc
> 2.  abs(django_id) desc
>
> These functions break because I am trying to perform a numeric operation on
> a string field.
> I am looking for a type cast function. Is there a way to convert the field
> to an integer before sorting?
> ideally:
> num(django_id) desc or int(django_id) desc
>
> I am also willing to submit a patch if this is not already present and
> feasible to implement.
> If type casting fails, default to -1.
>
> Alternate solution:
> I can add a new integer field to the schema and store the same data as an
> integer, but prefer to have a function, if possible.
>
> Thanks,
> Bharad