Posted to solr-dev@lucene.apache.org by "Erik Hatcher (JIRA)" <ji...@apache.org> on 2009/09/30 18:07:23 UTC

[jira] Created: (SOLR-1478) Enable sort by docid

Enable sort by docid
--------------------

                 Key: SOLR-1478
                 URL: https://issues.apache.org/jira/browse/SOLR-1478
             Project: Solr
          Issue Type: New Feature
          Components: search
            Reporter: Erik Hatcher
            Priority: Minor
             Fix For: 1.4


Lucene allows sorting by docid, but Solr currently does not provide a way to specify it. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1478) Enable sort by docid

Posted by "Shalin Shekhar Mangar (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12761584#action_12761584 ] 

Shalin Shekhar Mangar commented on SOLR-1478:
---------------------------------------------

I don't like having an arbitrary character like '#' signifying a sort type because it does not explain itself to a user. Once 1.4 goes out, it will be public API and we won't be able to change this easily. Erik, please consider this again.

This also does not work with distributed search, which should be clearly noted wherever we decide to document this. ShardDoc.java line 170 says that it is possible to support it, but I'm not sure what Yonik had in mind.

> Enable sort by docid
> --------------------
>
>                 Key: SOLR-1478
>                 URL: https://issues.apache.org/jira/browse/SOLR-1478
>             Project: Solr
>          Issue Type: New Feature
>          Components: search
>            Reporter: Erik Hatcher
>            Priority: Minor
>             Fix For: 1.4
>
>         Attachments: SOLR-1478.patch
>
>
> Lucene allows sorting by docid, but Solr currently does not provide a way to specify it. 



[jira] Commented: (SOLR-1478) Enable sort by docid

Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12761216#action_12761216 ] 

Grant Ingersoll commented on SOLR-1478:
---------------------------------------

Sounds good.



[jira] Commented: (SOLR-1478) Enable sort by docid

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12761583#action_12761583 ] 

Yonik Seeley commented on SOLR-1478:
------------------------------------

Does this work with distributed search?



[jira] Commented: (SOLR-1478) Enable sort by docid

Posted by "Shalin Shekhar Mangar (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12761585#action_12761585 ] 

Shalin Shekhar Mangar commented on SOLR-1478:
---------------------------------------------

bq. Does this work with distributed search?

No, it throws an exception:
{code}
SEVERE: java.lang.RuntimeException: Doc sort not supported
	at org.apache.solr.handler.component.ShardFieldSortedHitQueue.getCachedComparator(ShardDoc.java:171)
	at org.apache.solr.handler.component.ShardFieldSortedHitQueue.<init>(ShardDoc.java:96)
	at org.apache.solr.handler.component.QueryComponent.mergeIds(QueryComponent.java:393)
	at org.apache.solr.handler.component.QueryComponent.handleResponses(QueryComponent.java:298)
	at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:290)
{code}



[jira] Commented: (SOLR-1478) Enable sort by docid

Posted by "Erik Hatcher (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12761518#action_12761518 ] 

Erik Hatcher commented on SOLR-1478:
------------------------------------

I committed, and left the special "field" as "#".  I'd rather avoid a string that could potentially be a field name in use, and sorting by docid will be such a specialized case that the encoding confusion won't be too bad.  Folks have to deal with URL encoding everywhere anyway.  I kinda like that character to mean "number".



[jira] Commented: (SOLR-1478) Enable sort by docid

Posted by "Steven Rowe (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12761699#action_12761699 ] 

Steven Rowe commented on SOLR-1478:
-----------------------------------

Providing aliases would allow all parties to get what they want.  Downside: maintenance/documentation issues with multiple syntaxes (minor IMHO).  Upside: collision probability goes down even further.



[jira] Commented: (SOLR-1478) Enable sort by docid

Posted by "Erik Hatcher (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12761188#action_12761188 ] 

Erik Hatcher commented on SOLR-1478:
------------------------------------

Only the LukeRequestHandler, as far as I can tell, allows fetching a document by docid, and it returns the docid in the response too.

I don't see a need to return the docid even if one is sorting by it.  Sorting by docid allows for last-in-first-out, or first-in-first-out, sorting without the caching overhead of sorting by a field.
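
As a rough illustration of the last-in-first-out and first-in-first-out orderings mentioned above, here is a minimal Java sketch. It does not use Lucene or Solr; the `Doc` class and `sortByDocid` method are hypothetical stand-ins, and it assumes docids reflect insertion order, which is only guaranteed here because the sample data is built that way.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class DocIdOrderSketch {
    /** Hypothetical stand-in for an indexed document. */
    static final class Doc {
        final int docid;  // internal number, assumed to follow insertion order
        final String id;  // application-level unique key
        Doc(int docid, String id) { this.docid = docid; this.id = id; }
    }

    /** Returns the unique keys of docs ordered by docid, ascending or descending. */
    static List<String> sortByDocid(List<Doc> docs, boolean descending) {
        Comparator<Doc> byDocid = Comparator.comparingInt(d -> d.docid);
        if (descending) {
            byDocid = byDocid.reversed();
        }
        List<Doc> copy = new ArrayList<>(docs);  // leave the caller's list untouched
        copy.sort(byDocid);
        List<String> ids = new ArrayList<>();
        for (Doc d : copy) {
            ids.add(d.id);
        }
        return ids;
    }

    public static void main(String[] args) {
        List<Doc> docs = Arrays.asList(new Doc(0, "a"), new Doc(1, "b"), new Doc(2, "c"));
        System.out.println(sortByDocid(docs, false)); // first-in-first-out: [a, b, c]
        System.out.println(sortByDocid(docs, true));  // last-in-first-out:  [c, b, a]
    }
}
```

No per-field values are loaded to produce either order, which is the point being made about avoiding field-sort caching.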



[jira] Commented: (SOLR-1478) Enable sort by docid

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12761725#action_12761725 ] 

Yonik Seeley commented on SOLR-1478:
------------------------------------

A Lucene field name can be anything... so '#' could also be a collision.
If we wish to reserve certain names going forward, I'd vote for reserving ids with an underscore on either side.

But really, the whole collision thing is overblown... this is a single name that people will not have used before. On a practical level, I don't believe it's an issue.
We will need another one too - as a container for document metadata.  I've suggested _meta_ for that in SOLR-705.

We aren't adding these all the time... there was exactly one before this: "score".  No future document level metadata will collide since they will be contained in whatever _meta_ ends up being.

Further advantages to __id__  (single underscores surrounding the id):
 - consistent with magic fieldnames __query__ and __val__ for nested queries in the query parser, and I could see supporting __id__:1 in the future
 - people *may* want to return the actual ids for documents... wherever that info goes (separate return vector like sort_field_values for distributed search or __meta__) it will be nicer for clients if the label for it is actually an identifier and not '#'



[jira] Commented: (SOLR-1478) Enable sort by docid

Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12761158#action_12761158 ] 

Grant Ingersoll commented on SOLR-1478:
---------------------------------------

Does Solr ever expose the docid to users?



[jira] Updated: (SOLR-1478) Enable sort by docid

Posted by "Erik Hatcher (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-1478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erik Hatcher updated SOLR-1478:
-------------------------------

    Attachment: SOLR-1478.patch

This patch adds a special sort field (like "score" is implemented) to enable sorting by docid.  

The character "#" was used simply to avoid any potential field name overlap, but this requires URL encoding it to %23, so maybe some other string should be used?  

Here's an example URL: http://localhost:8983/solr/select?q=*:*&sort=%23%20desc&fl=id

Seems like score and docid sorting should avoid using normal field name strings, so maybe _score_ and _docid_ or something.

I marked this for 1.4, because it's a trivial patch.  Discussion welcome.
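
To make the encoding concern concrete, here is a small Java sketch of building the sort parameter programmatically. The class and method names are hypothetical; only `java.net.URLEncoder` is a real API. Note that `URLEncoder` writes a space as `+` rather than the `%20` used in the example URL above; both forms are commonly decoded to a space in query strings.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class SortParamEncodingSketch {
    /** Builds a "sort=..." query parameter with the sort spec form-encoded. */
    static String encodeSortParam(String sortSpec) {
        try {
            return "sort=" + URLEncoder.encode(sortSpec, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is required on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(encodeSortParam("# desc"));       // sort=%23+desc
        System.out.println(encodeSortParam("_docid_ desc")); // sort=_docid_+desc
    }
}
```

A name such as _docid_ passes through unchanged, whereas "#" always has to be escaped.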



[jira] Commented: (SOLR-1478) Enable sort by docid

Posted by "Shalin Shekhar Mangar (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12763260#action_12763260 ] 

Shalin Shekhar Mangar commented on SOLR-1478:
---------------------------------------------

bq. I've been thinking _docid_ instead of _id_ since it's further from "id", what we normally use as the unique key field for documents.

+1



[jira] Commented: (SOLR-1478) Enable sort by docid

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12763255#action_12763255 ] 

Yonik Seeley commented on SOLR-1478:
------------------------------------

I've been thinking _docid_ instead of _id_ since it's further from "id", what we normally use as the unique key field for documents.

OK, since Erik also proposed that as an alternative, and because Shalin also seems to be OK with that alternative, I'll commit that change unless I hear that more people favor a different alternative (keeping # or using _id_)
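
For readers wondering what "checking for a special name" could look like, here is a minimal Java sketch of a sort-spec parser that treats "score" and "_docid_" as pseudo-fields. This is not Solr's actual parsing code; `SortSpecSketch`, `Kind`, and `SortClause` are hypothetical names, and the sketch assumes a single clause of the form "<name> <asc|desc>".

```java
import java.util.Locale;

final class SortSpecSketch {
    enum Kind { SCORE, DOCID, FIELD }

    /** Hypothetical result type: what to sort on and in which direction. */
    static final class SortClause {
        final Kind kind;
        final String name;
        final boolean descending;
        SortClause(Kind kind, String name, boolean descending) {
            this.kind = kind;
            this.name = name;
            this.descending = descending;
        }
    }

    /** Parses one clause such as "_docid_ desc" or "price asc". */
    static SortClause parse(String spec) {
        String[] parts = spec.trim().split("\\s+");
        if (parts.length != 2) {
            throw new IllegalArgumentException("expected '<name> <asc|desc>': " + spec);
        }
        String name = parts[0];
        String dir = parts[1].toLowerCase(Locale.ROOT);
        boolean descending;
        if (dir.equals("desc")) {
            descending = true;
        } else if (dir.equals("asc")) {
            descending = false;
        } else {
            throw new IllegalArgumentException("unknown sort direction: " + parts[1]);
        }
        // Special names are checked semantically; no lexer changes are needed.
        Kind kind;
        if (name.equals("score")) {
            kind = Kind.SCORE;
        } else if (name.equals("_docid_")) {
            kind = Kind.DOCID;
        } else {
            kind = Kind.FIELD;
        }
        return new SortClause(kind, name, descending);
    }

    public static void main(String[] args) {
        SortClause c = parse("_docid_ desc");
        System.out.println(c.kind + " descending=" + c.descending); // DOCID descending=true
    }
}
```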




[jira] Issue Comment Edited: (SOLR-1478) Enable sort by docid

Posted by "Shalin Shekhar Mangar (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12761121#action_12761121 ] 

Shalin Shekhar Mangar edited comment on SOLR-1478 at 10/1/09 12:46 AM:
-----------------------------------------------------------------------

Perhaps something like _ DOCID _ instead of #. I am even tempted to suggest just using DOCID like we have SCORE.

[Edit] - Jira ate my suggestion

      was (Author: shalinmangar):
    Perhaps something like _DOCID_ instead of #. I am even tempted to suggest just using DOCID like we have SCORE.
  


[jira] Resolved: (SOLR-1478) Enable sort by docid

Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-1478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Grant Ingersoll resolved SOLR-1478.
-----------------------------------

    Resolution: Fixed



[jira] Commented: (SOLR-1478) Enable sort by docid

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12761694#action_12761694 ] 

Yonik Seeley commented on SOLR-1478:
------------------------------------

{code}
_id_
_docid_
{code}
?

The chance of collision is super low - I'd wager that no one has ever used __id__ in their schema (single underscores on either side... it's doubled to prevent wiki syntax from turning it into italics)



[jira] Commented: (SOLR-1478) Enable sort by docid

Posted by "Shalin Shekhar Mangar (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12761121#action_12761121 ] 

Shalin Shekhar Mangar commented on SOLR-1478:
---------------------------------------------

Perhaps something like _DOCID_ instead of #. I am even tempted to suggest just using DOCID like we have SCORE.



[jira] Commented: (SOLR-1478) Enable sort by docid

Posted by "Steven Rowe (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12761710#action_12761710 ] 

Steven Rowe commented on SOLR-1478:
-----------------------------------

Another thought: the XML specification reserves names matching regex {{/^xml/i}} to itself for future use (see http://www.w3.org/TR/xml/#sec-common-syn).  Maybe Solr should do the same?  That way, this discussion wouldn't have to be repeated for each new pseudo-field.
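
A reserved-name rule of this kind is cheap to check. The Java sketch below shows the XML-style prefix test next to a hypothetical Solr-side convention of names wrapped in single underscores, as floated elsewhere in this thread. Neither pattern is an existing Solr rule; the class and method names are illustrative only.

```java
import java.util.regex.Pattern;

final class ReservedNameSketch {
    // Java equivalent of the /^xml/i check quoted above.
    static final Pattern XML_RESERVED = Pattern.compile("^xml", Pattern.CASE_INSENSITIVE);

    // Hypothetical convention: a non-empty name wrapped in single underscores.
    static final Pattern UNDERSCORE_RESERVED = Pattern.compile("_.+_");

    /** True if the name starts with "xml" in any letter case. */
    static boolean isXmlReserved(String name) {
        return XML_RESERVED.matcher(name).find();
    }

    /** True if the whole name has the form _something_. */
    static boolean isUnderscoreReserved(String name) {
        return UNDERSCORE_RESERVED.matcher(name).matches();
    }

    public static void main(String[] args) {
        System.out.println(isXmlReserved("XmlLang"));        // true
        System.out.println(isXmlReserved("myxml"));          // false
        System.out.println(isUnderscoreReserved("_docid_")); // true
        System.out.println(isUnderscoreReserved("id"));      // false
    }
}
```

With a single rule like this, a schema loader could reject or warn about colliding field names once, instead of each pseudo-field needing its own discussion.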



[jira] Issue Comment Edited: (SOLR-1478) Enable sort by docid

Posted by "Steven Rowe (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12761699#action_12761699 ] 

Steven Rowe edited comment on SOLR-1478 at 10/2/09 1:26 PM:
------------------------------------------------------------

Providing aliases would allow all parties to get what they want.  Downside: maintenance/documentation issues with multiple syntaxes (minor IMHO).  Upside: collision probability goes down even further.

*edit* oops, completely wrong on the "upside" -- collision probability actually goes up, not down, since the set of noncolliding field names is reduced by each reserved pseudo-field name.  Still, aliases totally rock.

      was (Author: steve_rowe):
    Providing aliases would allow all parties to get what they want.  Downside: maintenance/documentation issues with multiple syntaxes (minor IMHO).  Upside: collision probability goes down even further.
  


[jira] Commented: (SOLR-1478) Enable sort by docid

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12761581#action_12761581 ] 

Yonik Seeley commented on SOLR-1478:
------------------------------------

A few things I don't like about '#':
 -  unlike many other characters, the browser can't encode it for you. For example, I can type in "sort=foo desc" into my browser and it can encode the space for me.  If I type in a literal #, the request is silently truncated at that point (the browser treats the rest as a URL fragment and does not send it to Solr).  People will have trouble with this one.
 - it can require lexical modification to other parsers (as opposed to semantic modification).  Things like function queries or anything else that parses out field names or parameters would need to be modified at the lexical level to accept # - it's generally easier to just check for a special name.
 - it looks like a comment
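
The truncation problem in the first point can be reproduced without a browser. A literal "#" starts the URL fragment, and fragments are not sent to the server, so everything after it never reaches Solr. The Java sketch below uses `java.net.URI` to show where the query ends in each case; the class and method names are hypothetical, and the URL is the example one from this issue.

```java
import java.net.URI;

final class FragmentTruncationSketch {
    /** Returns the raw (still percent-encoded) query part of the URL, or null. */
    static String queryPart(String url) {
        return URI.create(url).getRawQuery();
    }

    /** Returns the raw fragment part of the URL, or null if there is none. */
    static String fragmentPart(String url) {
        return URI.create(url).getRawFragment();
    }

    public static void main(String[] args) {
        String literal = "http://localhost:8983/solr/select?q=*:*&sort=#%20desc&fl=id";
        String encoded = "http://localhost:8983/solr/select?q=*:*&sort=%23%20desc&fl=id";

        // Literal '#': the query stops at "sort=", the rest becomes the fragment.
        System.out.println(queryPart(literal));    // q=*:*&sort=
        System.out.println(fragmentPart(literal)); // %20desc&fl=id

        // Encoded '%23': the whole sort spec stays inside the query.
        System.out.println(queryPart(encoded));    // q=*:*&sort=%23%20desc&fl=id
        System.out.println(fragmentPart(encoded)); // null
    }
}
```

In the literal case the fl parameter is lost as well, which makes the failure harder to spot.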
 
