You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@lucene.apache.org by "Stu Hood (JIRA)" <ji...@apache.org> on 2010/08/28 23:00:52 UTC

[jira] Created: (LUCENE-2628) Extract OpenBitSet to Apache Commons

Extract OpenBitSet to Apache Commons
------------------------------------

                 Key: LUCENE-2628
                 URL: https://issues.apache.org/jira/browse/LUCENE-2628
             Project: Lucene - Java
          Issue Type: Wish
            Reporter: Stu Hood


o.a.l.util.OpenBitSet is a great alternative to java.util.BitSet, and it is generally useful outside of the search field. It would be great if OpenBitSet were available outside of Lucene proper, perhaps as part of Apache Commons.

Aside from the communication required to accomplish this, there is the small issue of OpenBitSet extending o.a.l.search.DocIdSet in Lucene 3.0. There is very little logic contained in DocIdSet, so it could probably become an interface: Lucene proper could then extend the extract version of OpenBitSet to implement DocIdSet.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Commented: (LUCENE-2628) Extract OpenBitSet to Apache Commons

Posted by "Shai Erera (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12903959#action_12903959 ] 

Shai Erera commented on LUCENE-2628:
------------------------------------

Well ... if Cassandra already depends on Lucene then you're right. But I guess it doesn't make sense to introduce a dependency on a 1MB JAR for a class like OBS. It's practically the same argument as to why not extract OBS into commons ... but of course the folks at Cassandra are free to choose whichever way works for them.

> Extract OpenBitSet to Apache Commons
> ------------------------------------
>
>                 Key: LUCENE-2628
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2628
>             Project: Lucene - Java
>          Issue Type: Wish
>            Reporter: Stu Hood
>
> o.a.l.util.OpenBitSet is a great alternative to java.util.BitSet, and it is generally useful outside of the search field. It would be great if OpenBitSet were available outside of Lucene proper, perhaps as part of Apache Commons.
> Aside from the communication required to accomplish this, there is the small issue of OpenBitSet extending o.a.l.search.DocIdSet in Lucene 3.0. There is very little logic contained in DocIdSet, so it could probably become an interface: Lucene proper could then extend the extract version of OpenBitSet to implement DocIdSet.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Commented: (LUCENE-2628) Extract OpenBitSet to Apache Commons

Posted by "Robert Muir (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904154#action_12904154 ] 

Robert Muir commented on LUCENE-2628:
-------------------------------------

bq. Should be pretty strong reasons to break things out into modules. I don't want to have to piece together 15 tiny jars every time i write a some simple lucene code.

I agree. an example that might fit this case would be queryparsing: 
* there is a mix of functionality strewn across lucene/solr. 
* most of them are 99% the same and differ only in one small piece
* its a back compat nightmare (given javacc grammars etc)
* functionality and consistency lags behind, maybe due to the above: e.g. lack of span support in most

for something like that, i wouldnt mind an additional jar if it was all cleaned up and organized into a module, where things like Version could be dropped
and it could evolve naturally: real versions and you just keep using the old jar if you want exact behavior.



> Extract OpenBitSet to Apache Commons
> ------------------------------------
>
>                 Key: LUCENE-2628
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2628
>             Project: Lucene - Java
>          Issue Type: Wish
>            Reporter: Stu Hood
>
> o.a.l.util.OpenBitSet is a great alternative to java.util.BitSet, and it is generally useful outside of the search field. It would be great if OpenBitSet were available outside of Lucene proper, perhaps as part of Apache Commons.
> Aside from the communication required to accomplish this, there is the small issue of OpenBitSet extending o.a.l.search.DocIdSet in Lucene 3.0. There is very little logic contained in DocIdSet, so it could probably become an interface: Lucene proper could then extend the extract version of OpenBitSet to implement DocIdSet.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Commented: (LUCENE-2628) Extract OpenBitSet to Apache Commons

Posted by "Simon Willnauer (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12903983#action_12903983 ] 

Simon Willnauer commented on LUCENE-2628:
-----------------------------------------

bq. I'm not sure I like this. Most of the stuff in the util package should be marked lucene.internal anyway, so such a jar would be of minimal use.
You are probably right here and I did really expect that to be a serious option. I was just curious if other would like that so we could discuss the pros and cons. I personally think if it is only one class people should duplicate that code and try to keep in sync on their own. Yet the whole code duplication discussion is a valid point but not worth the hassle here. 

bq. Although some might consider utility stuff like this useful, I don't think we should be creating artifacts that arent focused on search. 
we should keep this stuff internal.
I actually don't agree on that, if a nice stand alone jar comes out of what we do here we should not hesitate to make it usable without the core but that might be a super rare case and might even have never happened before here but generally being against such a thing doesn't seem to be community or cross project friendly to me. I can understand the reasons why this issue has been created and I support not to push stuff towards a third party dependency from a lucene point of view (that is just not an option and should stay like that) but still we can try working with others to use lucene code in a more convenient way, search related or not.

> Extract OpenBitSet to Apache Commons
> ------------------------------------
>
>                 Key: LUCENE-2628
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2628
>             Project: Lucene - Java
>          Issue Type: Wish
>            Reporter: Stu Hood
>
> o.a.l.util.OpenBitSet is a great alternative to java.util.BitSet, and it is generally useful outside of the search field. It would be great if OpenBitSet were available outside of Lucene proper, perhaps as part of Apache Commons.
> Aside from the communication required to accomplish this, there is the small issue of OpenBitSet extending o.a.l.search.DocIdSet in Lucene 3.0. There is very little logic contained in DocIdSet, so it could probably become an interface: Lucene proper could then extend the extract version of OpenBitSet to implement DocIdSet.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Issue Comment Edited: (LUCENE-2628) Extract OpenBitSet to Apache Commons

Posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12903985#action_12903985 ] 

Chris A. Mattmann edited comment on LUCENE-2628 at 8/29/10 12:47 PM:
---------------------------------------------------------------------

*puts on ASF member hat*

Unless the Lucene folks want to champion moving over the OpenBitSet class into commons, then this discussion is really moot. The reporter of this issue (Stu) is free to create a patch for commons that duplicates this class, and if the commons folks are willing to manage it over there, then that's on them. If the Lucene folks at some point down the road then decide to take Stu's commons contribution, and update their internal Lucene code to use the commons versions, then more power to them. If the opposite is true, that's fine as well. This is all ASLv2 licensed stuff in the end, and folks are free to fork. 

That said, it's on the maintainers of libraries in *both* communities to decide if things are worth the reorganization in the end. For the commons side, it seems like an easier job (accept patch, find someone to maintain it, e.g., Stu). On the Lucene side, depending on how much or how little reuse you want to do (reuse entirely, remove class in Lucene, to subclassing an internal Lucene class that depends on the new commons one, and/or overrides it, all the way to keeping a copy of the commons class in Lucene-ville), you guys can choose where to go from there. 

      was (Author: chrismattmann):
    *puts on ASF member hat*

Unless the Lucene folks want to champion moving over the OpenBitSet class into commons, then this discussion is really moot. The reporter of this issue (Stu) is free to create a patch for commons that duplicates this class, and if the commons folks are willing to manage it over there, then that's on them. If the Lucene folks at some point down the road then decide to take Stu's commons contribution, and update their internal Lucene code to use the commons versions, then more power to them. If the opposite is true, that's fine as well. This is all ASLv2 licensed stuff in the end, and folks are free to fork. 

That said, it's on the maintainers of libraries in *both* communities to decide if things are worth the reorganization in the end. For the commons side, it seems like an easier job (accept patch, find someone to maintain it, e.g., Stu). On the Lucene side, depending on how or how little reuse you want to do (reuse entirely, remove class in Lucene, to subclassing an internal Lucene class that depends on the new commons one, and/or overrides it, all the way to keeping a copy of the commons class in Lucene-ville). 
  
> Extract OpenBitSet to Apache Commons
> ------------------------------------
>
>                 Key: LUCENE-2628
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2628
>             Project: Lucene - Java
>          Issue Type: Wish
>            Reporter: Stu Hood
>
> o.a.l.util.OpenBitSet is a great alternative to java.util.BitSet, and it is generally useful outside of the search field. It would be great if OpenBitSet were available outside of Lucene proper, perhaps as part of Apache Commons.
> Aside from the communication required to accomplish this, there is the small issue of OpenBitSet extending o.a.l.search.DocIdSet in Lucene 3.0. There is very little logic contained in DocIdSet, so it could probably become an interface: Lucene proper could then extend the extract version of OpenBitSet to implement DocIdSet.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Commented: (LUCENE-2628) Extract OpenBitSet to Apache Commons

Posted by "Shai Erera (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12903965#action_12903965 ] 

Shai Erera commented on LUCENE-2628:
------------------------------------

So let's close that issue?

> Extract OpenBitSet to Apache Commons
> ------------------------------------
>
>                 Key: LUCENE-2628
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2628
>             Project: Lucene - Java
>          Issue Type: Wish
>            Reporter: Stu Hood
>
> o.a.l.util.OpenBitSet is a great alternative to java.util.BitSet, and it is generally useful outside of the search field. It would be great if OpenBitSet were available outside of Lucene proper, perhaps as part of Apache Commons.
> Aside from the communication required to accomplish this, there is the small issue of OpenBitSet extending o.a.l.search.DocIdSet in Lucene 3.0. There is very little logic contained in DocIdSet, so it could probably become an interface: Lucene proper could then extend the extract version of OpenBitSet to implement DocIdSet.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Commented: (LUCENE-2628) Extract OpenBitSet to Apache Commons

Posted by "Robert Muir (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12903990#action_12903990 ] 

Robert Muir commented on LUCENE-2628:
-------------------------------------

{quote}
I actually don't agree on that, if a nice stand alone jar comes out of what we do here we should not hesitate to make it usable without the core but that might be a super rare case and might even have never happened before here but generally being against such a thing doesn't seem to be community or cross project friendly to me.
{quote}

I'm sorry, I strongly disagree. I dont think we should publish artifacts such as "utility jars" that are not directly related to search.

Furthermore, I do think more things should be marked lucene.internal. In any other good size software project, there is a mechanism beyond java 'public/protected/private' keywords so that
# the code can be shared across packages for internal use
# the code is still not 'exposed' and supported as a public API.

In the JDK, this stuff might be under the package sun.*, in other projects, i have seen ".impl" packages.

The only mechanism we have to keep things reasonable from a support (back-compat, etc) standpoint is to mark things lucene.internal, and we should use it, or we prevent changes/progress by creating an unreasonable support burden. 


> Extract OpenBitSet to Apache Commons
> ------------------------------------
>
>                 Key: LUCENE-2628
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2628
>             Project: Lucene - Java
>          Issue Type: Wish
>            Reporter: Stu Hood
>
> o.a.l.util.OpenBitSet is a great alternative to java.util.BitSet, and it is generally useful outside of the search field. It would be great if OpenBitSet were available outside of Lucene proper, perhaps as part of Apache Commons.
> Aside from the communication required to accomplish this, there is the small issue of OpenBitSet extending o.a.l.search.DocIdSet in Lucene 3.0. There is very little logic contained in DocIdSet, so it could probably become an interface: Lucene proper could then extend the extract version of OpenBitSet to implement DocIdSet.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Commented: (LUCENE-2628) Extract OpenBitSet to Apache Commons

Posted by "Mark Miller (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904147#action_12904147 ] 

Mark Miller commented on LUCENE-2628:
-------------------------------------

Making modules for modules sake is not the right path I think. Some modules make sense and will be helpful - but modules in general are a pain in the ass for devs compared to one jar. Should be pretty strong reasons to break things out into modules. I don't want to have to piece together 15 tiny jars every time i write a some simple lucene code.

> Extract OpenBitSet to Apache Commons
> ------------------------------------
>
>                 Key: LUCENE-2628
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2628
>             Project: Lucene - Java
>          Issue Type: Wish
>            Reporter: Stu Hood
>
> o.a.l.util.OpenBitSet is a great alternative to java.util.BitSet, and it is generally useful outside of the search field. It would be great if OpenBitSet were available outside of Lucene proper, perhaps as part of Apache Commons.
> Aside from the communication required to accomplish this, there is the small issue of OpenBitSet extending o.a.l.search.DocIdSet in Lucene 3.0. There is very little logic contained in DocIdSet, so it could probably become an interface: Lucene proper could then extend the extract version of OpenBitSet to implement DocIdSet.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Commented: (LUCENE-2628) Extract OpenBitSet to Apache Commons

Posted by "Robert Muir (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12903976#action_12903976 ] 

Robert Muir commented on LUCENE-2628:
-------------------------------------

{quote}
I still wanna throw one possible solution in for discussion. Since Lucene has many very useful classes in its utils package we could consider package the utils package as its own jar decoupled somehow from the core. That way people could use those classes without pulling in the whole lucene core and the interface would remain unchanged. It still feels odd that core would depend on utils jar but it would be an option and it would fit into the modules concept.
{quote}

I'm not sure I like this. Most of the stuff in the util package should be marked lucene.internal anyway, so such a jar would be of minimal use.

Furthermore, I think search related modules make sense, (spatial search, text analyzers, queries, queryparsers) because this is a search engine library project.
But i don't think we should deviate from that.

Although some might consider utility stuff like this useful, I don't think we should be creating artifacts that arent focused on search. 
we should keep this stuff internal.

> Extract OpenBitSet to Apache Commons
> ------------------------------------
>
>                 Key: LUCENE-2628
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2628
>             Project: Lucene - Java
>          Issue Type: Wish
>            Reporter: Stu Hood
>
> o.a.l.util.OpenBitSet is a great alternative to java.util.BitSet, and it is generally useful outside of the search field. It would be great if OpenBitSet were available outside of Lucene proper, perhaps as part of Apache Commons.
> Aside from the communication required to accomplish this, there is the small issue of OpenBitSet extending o.a.l.search.DocIdSet in Lucene 3.0. There is very little logic contained in DocIdSet, so it could probably become an interface: Lucene proper could then extend the extract version of OpenBitSet to implement DocIdSet.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Commented: (LUCENE-2628) Extract OpenBitSet to Apache Commons

Posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12903985#action_12903985 ] 

Chris A. Mattmann commented on LUCENE-2628:
-------------------------------------------

*puts on ASF member hat*

Unless the Lucene folks want to champion moving over the OpenBitSet class into commons, then this discussion is really moot. The reporter of this issue (Stu) is free to create a patch for commons that duplicates this class, and if the commons folks are willing to manage it over there, then that's on them. If the Lucene folks at some point down the road then decide to take Stu's commons contribution, and update their internal Lucene code to use the commons versions, then more power to them. If the opposite is true, that's fine as well. This is all ASLv2 licensed stuff in the end, and folks are free to fork. 

That said, it's on the maintainers of libraries in *both* communities to decide if things are worth the reorganization in the end. For the commons side, it seems like an easier job (accept patch, find someone to maintain it, e.g., Stu). On the Lucene side, depending on how or how little reuse you want to do (reuse entirely, remove class in Lucene, to subclassing an internal Lucene class that depends on the new commons one, and/or overrides it, all the way to keeping a copy of the commons class in Lucene-ville). 

> Extract OpenBitSet to Apache Commons
> ------------------------------------
>
>                 Key: LUCENE-2628
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2628
>             Project: Lucene - Java
>          Issue Type: Wish
>            Reporter: Stu Hood
>
> o.a.l.util.OpenBitSet is a great alternative to java.util.BitSet, and it is generally useful outside of the search field. It would be great if OpenBitSet were available outside of Lucene proper, perhaps as part of Apache Commons.
> Aside from the communication required to accomplish this, there is the small issue of OpenBitSet extending o.a.l.search.DocIdSet in Lucene 3.0. There is very little logic contained in DocIdSet, so it could probably become an interface: Lucene proper could then extend the extract version of OpenBitSet to implement DocIdSet.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Commented: (LUCENE-2628) Extract OpenBitSet to Apache Commons

Posted by "Simon Willnauer (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904142#action_12904142 ] 

Simon Willnauer commented on LUCENE-2628:
-----------------------------------------

I agree with Robert and Shai on this issues and that other folks should just pull out the class they need. Adding it to commons makes sense and they should likely go that way. Still we should try to move the modularization forward as we did with analyzers etc. 
Yet making a module out of almost 100% lucene.internal API doesn't make sense and would bring lots of disadvantages as folk have outlined above. 

bq. But I'll fight this fight in another issue when I propose such a module 
Once it is beneficial for Lucene I am with you! I was only one more option which came to my mind.

That said, I suggest to close this issue and move forward. If nobody objects I am going to do that by the end of the day.

> Extract OpenBitSet to Apache Commons
> ------------------------------------
>
>                 Key: LUCENE-2628
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2628
>             Project: Lucene - Java
>          Issue Type: Wish
>            Reporter: Stu Hood
>
> o.a.l.util.OpenBitSet is a great alternative to java.util.BitSet, and it is generally useful outside of the search field. It would be great if OpenBitSet were available outside of Lucene proper, perhaps as part of Apache Commons.
> Aside from the communication required to accomplish this, there is the small issue of OpenBitSet extending o.a.l.search.DocIdSet in Lucene 3.0. There is very little logic contained in DocIdSet, so it could probably become an interface: Lucene proper could then extend the extract version of OpenBitSet to implement DocIdSet.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Commented: (LUCENE-2628) Extract OpenBitSet to Apache Commons

Posted by "Robert Muir (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904132#action_12904132 ] 

Robert Muir commented on LUCENE-2628:
-------------------------------------

Can you give a concrete example how a "utility jar" would be useful?

I didn't think so.

> Extract OpenBitSet to Apache Commons
> ------------------------------------
>
>                 Key: LUCENE-2628
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2628
>             Project: Lucene - Java
>          Issue Type: Wish
>            Reporter: Stu Hood
>
> o.a.l.util.OpenBitSet is a great alternative to java.util.BitSet, and it is generally useful outside of the search field. It would be great if OpenBitSet were available outside of Lucene proper, perhaps as part of Apache Commons.
> Aside from the communication required to accomplish this, there is the small issue of OpenBitSet extending o.a.l.search.DocIdSet in Lucene 3.0. There is very little logic contained in DocIdSet, so it could probably become an interface: Lucene proper could then extend the extract version of OpenBitSet to implement DocIdSet.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Commented: (LUCENE-2628) Extract OpenBitSet to Apache Commons

Posted by "Robert Muir (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12903958#action_12903958 ] 

Robert Muir commented on LUCENE-2628:
-------------------------------------

if cassandra wants to use this, it should depend on lucene.

no duplication of code needed

> Extract OpenBitSet to Apache Commons
> ------------------------------------
>
>                 Key: LUCENE-2628
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2628
>             Project: Lucene - Java
>          Issue Type: Wish
>            Reporter: Stu Hood
>
> o.a.l.util.OpenBitSet is a great alternative to java.util.BitSet, and it is generally useful outside of the search field. It would be great if OpenBitSet were available outside of Lucene proper, perhaps as part of Apache Commons.
> Aside from the communication required to accomplish this, there is the small issue of OpenBitSet extending o.a.l.search.DocIdSet in Lucene 3.0. There is very little logic contained in DocIdSet, so it could probably become an interface: Lucene proper could then extend the extract version of OpenBitSet to implement DocIdSet.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Commented: (LUCENE-2628) Extract OpenBitSet to Apache Commons

Posted by "Robert Muir (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12903885#action_12903885 ] 

Robert Muir commented on LUCENE-2628:
-------------------------------------

I think things like this should stay in Lucene and their implementations should remain specialized towards our use cases.

If we put them in apache commons and depend on them ourselves, then other use cases might want to make tradeoffs that
are bad for lucene (but perhaps good for other use cases).

Additionally the lucene core has no external dependencies, and I dont think we should add any.

In short, I don't see what we would gain here.

> Extract OpenBitSet to Apache Commons
> ------------------------------------
>
>                 Key: LUCENE-2628
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2628
>             Project: Lucene - Java
>          Issue Type: Wish
>            Reporter: Stu Hood
>
> o.a.l.util.OpenBitSet is a great alternative to java.util.BitSet, and it is generally useful outside of the search field. It would be great if OpenBitSet were available outside of Lucene proper, perhaps as part of Apache Commons.
> Aside from the communication required to accomplish this, there is the small issue of OpenBitSet extending o.a.l.search.DocIdSet in Lucene 3.0. There is very little logic contained in DocIdSet, so it could probably become an interface: Lucene proper could then extend the extract version of OpenBitSet to implement DocIdSet.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Commented: (LUCENE-2628) Extract OpenBitSet to Apache Commons

Posted by "Robert Muir (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904386#action_12904386 ] 

Robert Muir commented on LUCENE-2628:
-------------------------------------

bq. FWIW: I can ... the snag robert ran into in SOLR-2034. we don't want SolrJ to have a dependency on lucene-core, but it would be nice to re-use the UTF-8 serialization code instead of duplicating it

But maybe not, the stuff in unicodeutil isnt the best there anyway as its doing either:
* incremental conversion [and wasting cpu updating useless offsets]
* computing terms hash as it goes [and wasting cpu computing useless hash codes]


> Extract OpenBitSet to Apache Commons
> ------------------------------------
>
>                 Key: LUCENE-2628
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2628
>             Project: Lucene - Java
>          Issue Type: Wish
>            Reporter: Stu Hood
>
> o.a.l.util.OpenBitSet is a great alternative to java.util.BitSet, and it is generally useful outside of the search field. It would be great if OpenBitSet were available outside of Lucene proper, perhaps as part of Apache Commons.
> Aside from the communication required to accomplish this, there is the small issue of OpenBitSet extending o.a.l.search.DocIdSet in Lucene 3.0. There is very little logic contained in DocIdSet, so it could probably become an interface: Lucene proper could then extend the extract version of OpenBitSet to implement DocIdSet.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Commented: (LUCENE-2628) Extract OpenBitSet to Apache Commons

Posted by "Hoss Man (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904379#action_12904379 ] 

Hoss Man commented on LUCENE-2628:
----------------------------------

bq. so lets be honest, the "lucene" part boils down to: 'please delete this class and depend on this jar file instead.

agreed ... if commons wants to include OpenBitSet, and promote it's use to the java community at large, i'm all for that, but i don' really see any Lucene issue here at the moment.  If Commons's version of OpenBitSet takes off and becomes the defacto "bit set" impl people us in Java, then Lucene may want to reconsider it's current "no deps for core" policy and start depending on commons-bitset, but we aren't there yet, so we aren't there yet, so this is really a non-issue.


{quote}
I think a util jar is a great idea but not so we can publish it for others. As we modularise more, there will be utility classes that are useful across multiple modules. I dont think they should be stuck into lucene-core just because its the only consistent dependency. But I don't think OBS fits into this pool necessary since it really is tuned for the search func in lucene-core. 
{quote}
...
{quote}
Can you give a concrete example how a "utility jar" would be useful?

I didn't think so.
Can you give a concrete example how a "utility jar" would be useful? 

I didn't think so.
{quote}

FWIW: I can ... the snag robert ran into in SOLR-2034.  we don't want SolrJ to have a dependency on lucene-core, but it would be nice to re-use the UTF-8 serialization code instead of duplicating it.

> Extract OpenBitSet to Apache Commons
> ------------------------------------
>
>                 Key: LUCENE-2628
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2628
>             Project: Lucene - Java
>          Issue Type: Wish
>            Reporter: Stu Hood
>
> o.a.l.util.OpenBitSet is a great alternative to java.util.BitSet, and it is generally useful outside of the search field. It would be great if OpenBitSet were available outside of Lucene proper, perhaps as part of Apache Commons.
> Aside from the communication required to accomplish this, there is the small issue of OpenBitSet extending o.a.l.search.DocIdSet in Lucene 3.0. There is very little logic contained in DocIdSet, so it could probably become an interface: Lucene proper could then extend the extract version of OpenBitSet to implement DocIdSet.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Commented: (LUCENE-2628) Extract OpenBitSet to Apache Commons

Posted by "Chris Male (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904067#action_12904067 ] 

Chris Male commented on LUCENE-2628:
------------------------------------

I think a util jar is a great idea but not so we can publish it for others.  As we modularise more, there will be utility classes that are useful across multiple modules.  I dont think they should be stuck into lucene-core just because its the only consistent dependency.  But I don't think OBS fits into this pool necessary since it really is tuned for the search func in lucene-core.

But I'll fight this fight in another issue when I propose such a module :)

> Extract OpenBitSet to Apache Commons
> ------------------------------------
>
>                 Key: LUCENE-2628
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2628
>             Project: Lucene - Java
>          Issue Type: Wish
>            Reporter: Stu Hood
>
> o.a.l.util.OpenBitSet is a great alternative to java.util.BitSet, and it is generally useful outside of the search field. It would be great if OpenBitSet were available outside of Lucene proper, perhaps as part of Apache Commons.
> Aside from the communication required to accomplish this, there is the small issue of OpenBitSet extending o.a.l.search.DocIdSet in Lucene 3.0. There is very little logic contained in DocIdSet, so it could probably become an interface: Lucene proper could then extend the extract version of OpenBitSet to implement DocIdSet.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Commented: (LUCENE-2628) Extract OpenBitSet to Apache Commons

Posted by "Robert Muir (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12903961#action_12903961 ] 

Robert Muir commented on LUCENE-2628:
-------------------------------------

bq. but of course the folks at Cassandra are free to choose whichever way works for them. 

exactly, there doesnt need to be a lucene issue to put this code somewhere else. this can be done now.

so lets be honest, the "lucene" part boils down to: 'please delete this class and depend on this jar file instead. and by the way, you won't be able to commit to the code anymore'. 

good luck

> Extract OpenBitSet to Apache Commons
> ------------------------------------
>
>                 Key: LUCENE-2628
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2628
>             Project: Lucene - Java
>          Issue Type: Wish
>            Reporter: Stu Hood
>
> o.a.l.util.OpenBitSet is a great alternative to java.util.BitSet, and it is generally useful outside of the search field. It would be great if OpenBitSet were available outside of Lucene proper, perhaps as part of Apache Commons.
> Aside from the communication required to accomplish this, there is the small issue of OpenBitSet extending o.a.l.search.DocIdSet in Lucene 3.0. There is very little logic contained in DocIdSet, so it could probably become an interface: Lucene proper could then extend the extract version of OpenBitSet to implement DocIdSet.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Closed: (LUCENE-2628) Extract OpenBitSet to Apache Commons

Posted by "Shai Erera (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shai Erera closed LUCENE-2628.
------------------------------

    Resolution: Not A Problem

Closing the issue. This clearly isn't a Lucene issue. Like many people said, Stu can take OBS into commons and Lucene can decide later if it wants to adopt the commons version or not. The discussion on independent .jars or not can continue on the mailing list - I assume more people will read it there than when it's inside that issue (and also it's irrelevant to the issue itself).

> Extract OpenBitSet to Apache Commons
> ------------------------------------
>
>                 Key: LUCENE-2628
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2628
>             Project: Lucene - Java
>          Issue Type: Wish
>            Reporter: Stu Hood
>
> o.a.l.util.OpenBitSet is a great alternative to java.util.BitSet, and it is generally useful outside of the search field. It would be great if OpenBitSet were available outside of Lucene proper, perhaps as part of Apache Commons.
> Aside from the communication required to accomplish this, there is the small issue of OpenBitSet extending o.a.l.search.DocIdSet in Lucene 3.0. There is very little logic contained in DocIdSet, so it could probably become an interface: Lucene proper could then extend the extract version of OpenBitSet to implement DocIdSet.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Commented: (LUCENE-2628) Extract OpenBitSet to Apache Commons

Posted by "Simon Willnauer (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12903973#action_12903973 ] 

Simon Willnauer commented on LUCENE-2628:
-----------------------------------------

{quote}
Well ... if Cassandra already depends on Lucene then you're right. But I guess it doesn't make sense to introduce a dependency on a 1MB JAR for a class like OBS. It's practically the same argument as to why not extract OBS into commons ... but of course the folks at Cassandra are free to choose whichever way works for them.
{quote}

I agree with both Shai and Robert that the nature of Lucene's specialized utility classes are that they are still special purpose and the should not be changed just for the sake of making them usable by other projects because the like a class or two.

I still wanna throw one possible solution in for discussion. Since Lucene has many very useful classes in its utils package we could consider package the utils package as its own jar decoupled somehow from the core. That way people could use those classes without pulling in the whole lucene core and the interface would remain unchanged. It still feels odd that core would depend on utils jar but it would be an option and it would fit into the modules concept.


bq. So let's close that issue?
+1


> Extract OpenBitSet to Apache Commons
> ------------------------------------
>
>                 Key: LUCENE-2628
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2628
>             Project: Lucene - Java
>          Issue Type: Wish
>            Reporter: Stu Hood
>
> o.a.l.util.OpenBitSet is a great alternative to java.util.BitSet, and it is generally useful outside of the search field. It would be great if OpenBitSet were available outside of Lucene proper, perhaps as part of Apache Commons.
> Aside from the communication required to accomplish this, there is the small issue of OpenBitSet extending o.a.l.search.DocIdSet in Lucene 3.0. There is very little logic contained in DocIdSet, so it could probably become an interface: Lucene proper could then extend the extract version of OpenBitSet to implement DocIdSet.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Commented: (LUCENE-2628) Extract OpenBitSet to Apache Commons

Posted by "Shai Erera (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12903938#action_12903938 ] 

Shai Erera commented on LUCENE-2628:
------------------------------------

I agree w/ Robert. There are several drawbacks to externalizing Lucene code to Apache commons (or any other project):
# If we introduce a dependency on Apache commons in core, what will prevent us from adding more dependencies on other Apache projects in the future?
# This is sort of a tiny class, however very important. If tomorrow a change will be required, how will we get it into Lucene? Wait until the next commons release?
#* The point here is that Lucene already includes a bunch of such sophisticated classes because their counterpart in Java land is not optimal, and trying to fix it there and wait for a release is out of the question. Therefore I'm afraid that even if we externalize such a utility to commons, sometime in the future it might find its way back in Lucene (and while I don't claim to see the future, this has happened before already, therefore it's only logical it will happen again).
# Why stop w/ OpenBitSet? ArrayUtil has several useful methods as well. Perhaps we should externalize it too?

A different approach would be to copy-paste OBS code in Apache commons and let both versions live at the same time. Cassandra can depend on the one from commons if it wants to. If indeed nothing changes in OBS, then it's a win-win. If however OBS at commons will change drastically, and will e.g. perform faster, then we can open another issue about adopting it back into Lucene (in the form of copy-paste or dependency). I don't see anything that prevents you from starting w/ that approach, right?

Core Lucene has managed to avoid external dependencies so far, and I don't see a reason to introduce one for this class alone.

> Extract OpenBitSet to Apache Commons
> ------------------------------------
>
>                 Key: LUCENE-2628
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2628
>             Project: Lucene - Java
>          Issue Type: Wish
>            Reporter: Stu Hood
>
> o.a.l.util.OpenBitSet is a great alternative to java.util.BitSet, and it is generally useful outside of the search field. It would be great if OpenBitSet were available outside of Lucene proper, perhaps as part of Apache Commons.
> Aside from the communication required to accomplish this, there is the small issue of OpenBitSet extending o.a.l.search.DocIdSet in Lucene 3.0. There is very little logic contained in DocIdSet, so it could probably become an interface: Lucene proper could then extend the extract version of OpenBitSet to implement DocIdSet.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Commented: (LUCENE-2628) Extract OpenBitSet to Apache Commons

Posted by "Stu Hood (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12903931#action_12903931 ] 

Stu Hood commented on LUCENE-2628:
----------------------------------

> then other use cases might want to make tradeoffs that are bad for lucene
I can't claim to see the future, but I think this datastructure is already so specialized and solid that the chances of it evolving much further are slim.

> Additionally the lucene core has no external dependencies, and I dont think we should add any.
While this is definitely admirable and worth trying to preserve, it should be weighed against the duplication it incurs: especially within the Apache project itself. For example if the Cassandra project takes the same stance, then we end up with two copies of this class.

> Extract OpenBitSet to Apache Commons
> ------------------------------------
>
>                 Key: LUCENE-2628
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2628
>             Project: Lucene - Java
>          Issue Type: Wish
>            Reporter: Stu Hood
>
> o.a.l.util.OpenBitSet is a great alternative to java.util.BitSet, and it is generally useful outside of the search field. It would be great if OpenBitSet were available outside of Lucene proper, perhaps as part of Apache Commons.
> Aside from the communication required to accomplish this, there is the small issue of OpenBitSet extending o.a.l.search.DocIdSet in Lucene 3.0. There is very little logic contained in DocIdSet, so it could probably become an interface: Lucene proper could then extend the extract version of OpenBitSet to implement DocIdSet.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Commented: (LUCENE-2628) Extract OpenBitSet to Apache Commons

Posted by "Paul Elschot (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12903947#action_12903947 ] 

Paul Elschot commented on LUCENE-2628:
--------------------------------------

Using an interface for DocIdSet would freeze DocIdSet forever and that would not be advisable.

Nowadays the JITs might allow the use of delegation instead of inheritance without losing performance,
so one might reasonably try to delegate to OBS to avoid freezing DocIdSet. That would involve an extra class, but an extra class should be no problem.



> Extract OpenBitSet to Apache Commons
> ------------------------------------
>
>                 Key: LUCENE-2628
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2628
>             Project: Lucene - Java
>          Issue Type: Wish
>            Reporter: Stu Hood
>
> o.a.l.util.OpenBitSet is a great alternative to java.util.BitSet, and it is generally useful outside of the search field. It would be great if OpenBitSet were available outside of Lucene proper, perhaps as part of Apache Commons.
> Aside from the communication required to accomplish this, there is the small issue of OpenBitSet extending o.a.l.search.DocIdSet in Lucene 3.0. There is very little logic contained in DocIdSet, so it could probably become an interface: Lucene proper could then extend the extract version of OpenBitSet to implement DocIdSet.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org