Posted to solr-dev@lucene.apache.org by "Yonik Seeley (JIRA)" <ji...@apache.org> on 2007/12/12 05:45:43 UTC

[jira] Created: (SOLR-434) interfaces should support >2B docs

interfaces should support >2B docs
----------------------------------

                 Key: SOLR-434
                 URL: https://issues.apache.org/jira/browse/SOLR-434
             Project: Solr
          Issue Type: Improvement
            Reporter: Yonik Seeley
            Priority: Minor


External interfaces that deal with numbers of documents should eventually be able to deal with > 2B documents (that means long instead of int).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-434) interfaces should support >2B docs

Posted by "Ryan McKinley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12554928 ] 

Ryan McKinley commented on SOLR-434:
------------------------------------

> 
>> I think we can safely change all integers to long without problem.
> 
> So you mean, when SolrJ encounters a <int>100</int> it will create a new Long(100)?

Yes, that's what I'm suggesting.  Likewise, when it hits numFound=X and start=Y, it would also store a long.

> That wouldn't really be backward compatible with SolrJ users, but we haven't had a SolrJ release yet.
> 

Since there has not been a release, I think that is a reasonable change.  For most uses, the only thing people may see is a compile error for
 int count = results.getNumFound();
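
A minimal sketch of the caller-side adjustment, under stated assumptions. `Results` is a hypothetical stand-in for the SolrJ result list, not the real class; the only thing taken from the discussion is that `getNumFound()` would return a long. In standard Java, assigning a long to an int needs an explicit cast, so the line above would have to change in one of the following ways.

```java
public class NumFoundDemo {
    /** Hypothetical stand-in for a result list whose hit count is a long. */
    static final class Results {
        private final long numFound;

        Results(long numFound) {
            this.numFound = numFound;
        }

        long getNumFound() {
            return numFound;
        }
    }

    /** Narrows to int, throwing ArithmeticException rather than wrapping silently. */
    static int toIntCount(Results results) {
        return Math.toIntExact(results.getNumFound());
    }

    public static void main(String[] args) {
        Results results = new Results(3_000_000_000L);

        // Preferred: keep the value as a long.
        long count = results.getNumFound();
        System.out.println(count);

        // An explicit cast compiles, but wraps around for values above Integer.MAX_VALUE.
        int wrapped = (int) results.getNumFound();
        System.out.println(wrapped);
    }
}
```

Callers that genuinely need an int can use `toIntCount`, which fails loudly instead of returning a wrapped value.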

>> I don't think the external API <int> makes a contract to say the value will fit within the Java int range.
> 
> The only issue is that there is a <long> tag...
> I don't think Solr currently uses Long objects for serialization, but long field types currently use the <long> tag.
> 

Good point -- if you have <int> and <long>, one would think they mean something different... Also, for NamedList configuration (NamedListInitializedPlugin), <int> really expects to be a Java int field.






[jira] Commented: (SOLR-434) interfaces should support >2B docs

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12550835 ] 

Yonik Seeley commented on SOLR-434:
-----------------------------------

Note: this doesn't really apply to single instances of Solr/Lucene... 2B ids will be fine for quite some time.
However, distributed search can push over that limit, and we should prepare for it.

- SolrJ's SolrDocumentList should have numFound and start be longs
- how to handle these numbers in the XML format (change to long at some point... bump version?).  These don't present a problem in JSON/Python, since integers there have no fixed size limit.
- same issue with facet counts... should they be defined to be <int> unless individual values are large enough to overflow, or should we just change them to long in SolrJ and the XML?
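
A rough illustration of why distributed search can cross the limit even when every shard stays under it. The class, method names, and shard sizes below are made up for the example and are not Solr code.

```java
public class ShardCountDemo {
    /** Sums per-shard hit counts in an int; overflows once the total passes Integer.MAX_VALUE. */
    static int sumAsInt(int[] perShardCounts) {
        int total = 0;
        for (int c : perShardCounts) {
            total += c; // wraps around silently on overflow
        }
        return total;
    }

    /** Sums per-shard hit counts in a long; addExact would throw if even a long overflowed. */
    static long sumAsLong(long[] perShardCounts) {
        long total = 0L;
        for (long c : perShardCounts) {
            total = Math.addExact(total, c);
        }
        return total;
    }

    public static void main(String[] args) {
        // Two hypothetical shards, each comfortably below the ~2.1B int limit.
        System.out.println(sumAsInt(new int[] {1_500_000_000, 1_500_000_000}));
        System.out.println(sumAsLong(new long[] {1_500_000_000L, 1_500_000_000L}));
    }
}
```

The int total comes out negative for these inputs, while the long total is the expected 3,000,000,000.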





[jira] Commented: (SOLR-434) interfaces should support >2B docs

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12550836 ] 

Yonik Seeley commented on SOLR-434:
-----------------------------------

The other backward-compatible alternative for the XML format is to redefine what <int> means: basically treat it as an unbounded int.




[jira] Updated: (SOLR-434) interfaces should support >2B docs

Posted by "Ryan McKinley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan McKinley updated SOLR-434:
-------------------------------

    Attachment: SOLR-434-LongDocCount.patch

Switches the SolrJ SolrDocumentList representation from int to long.

Facet counts will accept <int> or <long>
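
One way a client could accept either tag is sketched below. This is not the code from the attached patch; `CountTagParser` and `parseCount` are hypothetical names, and the sketch only uses the JDK's built-in XML parser. It reads a single count element and returns a long regardless of whether the element is <int> or <long>.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;

public class CountTagParser {
    /** Parses a single <int> or <long> element and returns its value as a long. */
    static long parseCount(String xml) {
        Element el;
        try {
            el = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("not well-formed XML: " + xml, e);
        }

        String tag = el.getTagName();
        if (!tag.equals("int") && !tag.equals("long")) {
            throw new IllegalArgumentException("unexpected tag: " + tag);
        }
        // Long.parseLong handles values that would overflow a Java int.
        return Long.parseLong(el.getTextContent().trim());
    }

    public static void main(String[] args) {
        System.out.println(parseCount("<int name=\"electronics\">3000000000</int>"));
        System.out.println(parseCount("<long name=\"electronics\">3000000000</long>"));
    }
}
```

Treating both tags the same way on the reading side matches the "unbounded int" idea from the discussion: the tag name stops implying a 32-bit range.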




[jira] Commented: (SOLR-434) interfaces should support >2B docs

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12554844 ] 

Yonik Seeley commented on SOLR-434:
-----------------------------------

> I think we can safely change all integers to long without problem.

So you mean, when SolrJ encounters a <int>100</int> it will create a new Long(100)?
That wouldn't really be backward compatible with SolrJ users, but we haven't had a SolrJ release yet.

> I don't think the external API <int> makes a contract to say the value will fit within the Java int range.

The only issue is that there is a <long> tag...
I don't think Solr currently uses Long objects for serialization, but long field types currently use the <long> tag.





[jira] Commented: (SOLR-434) interfaces should support >2B docs

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12557136#action_12557136 ] 

Yonik Seeley commented on SOLR-434:
-----------------------------------

Looks good.
In my distributed faceting code I'm using ((Number)o).longValue() right now anyway.
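
The `((Number)o).longValue()` idiom works whether a shard hands back an Integer or a Long. A small sketch of merging facet counts that way follows; the class name, map layout, and sample values are assumptions for illustration, not the actual distributed faceting code.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class FacetCountMerge {
    /** Adds up per-shard facet counts, accepting any Number subtype as the count. */
    static Map<String, Long> merge(List<Map<String, Object>> shards) {
        Map<String, Long> merged = new LinkedHashMap<>();
        for (Map<String, Object> shard : shards) {
            for (Map.Entry<String, Object> e : shard.entrySet()) {
                // Integer and Long both extend Number, so one cast covers both.
                long count = ((Number) e.getValue()).longValue();
                merged.merge(e.getKey(), count, Long::sum);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Object> shardA = new LinkedHashMap<>();
        shardA.put("books", 2_000_000_000);   // boxed as Integer
        Map<String, Object> shardB = new LinkedHashMap<>();
        shardB.put("books", 2_000_000_000L);  // boxed as Long

        System.out.println(merge(Arrays.asList(shardA, shardB)));
    }
}
```

Because the merged value is a long, the combined count for "books" stays correct even though it exceeds the int range.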




[jira] Updated: (SOLR-434) interfaces should support >2B docs

Posted by "Ryan McKinley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan McKinley updated SOLR-434:
-------------------------------

    Fix Version/s: 1.3

SolrJ can change everything to long without a problem.

I think we can safely change all integers to long without a problem.  I don't think the external API <int> makes a contract to say the value will fit within the Java int range.  As you said, "basically treat it as an unbounded int".




[jira] Resolved: (SOLR-434) interfaces should support >2B docs

Posted by "Ryan McKinley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan McKinley resolved SOLR-434.
--------------------------------

    Resolution: Fixed
      Assignee: Ryan McKinley

