Posted to solr-dev@lucene.apache.org by "Kay Kay (JIRA)" <ji...@apache.org> on 2008/12/14 05:48:44 UTC

[jira] Created: (SOLR-912) org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList

org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList
----------------------------------------------------------------------------------------------------------------------------------------

                 Key: SOLR-912
                 URL: https://issues.apache.org/jira/browse/SOLR-912
             Project: Solr
          Issue Type: Improvement
          Components: clients - java
    Affects Versions: 1.4
         Environment: Tomcat 6, JRE 6, Solr 1.3+ nightlies 
            Reporter: Kay Kay
            Priority: Minor
             Fix For: 1.3.1


The implementation of NamedList, while fast, is not necessarily type-safe. I have written an additional implementation, ModernNamedList (a type-safe variant providing the same interface as NamedList), that preserves the existing semantics: element ordering is maintained, and null is allowed for both keys and values (keys are always Strings, while values are of the generic type).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-912) org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList

Posted by "Kay Kay (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kay Kay updated SOLR-912:
-------------------------

    Component/s:     (was: clients - java)
                 Analysis

> org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList
> ----------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-912
>                 URL: https://issues.apache.org/jira/browse/SOLR-912
>             Project: Solr
>          Issue Type: Improvement
>          Components: Analysis
>    Affects Versions: 1.4
>         Environment: Tomcat 6, JRE 6, Solr 1.3+ nightlies 
>            Reporter: Kay Kay
>            Priority: Minor
>             Fix For: 1.3.1
>
>         Attachments: SOLR-912.patch
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> The implementation of NamedList - while being fast - is not necessarily type-safe. I have implemented an additional implementation of the same - ModernNamedList (a type-safe variation providing the same interface as NamedList) - while preserving the semantics in terms of ordering of elements and allowing null elements for key and values (keys are always Strings , while values correspond to generics ). 



[jira] Updated: (SOLR-912) org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList

Posted by "Kay Kay (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kay Kay updated SOLR-912:
-------------------------

    Attachment: SOLR-912.patch

Introduce another ctor, Type(Object[]), to distinguish it from the ctors taking List<Map.Entry<String, T>> and a List of objects.

Change the invocations in DebugComponent, HighlightComponent, etc.

> org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList
> ----------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-912
>                 URL: https://issues.apache.org/jira/browse/SOLR-912
>             Project: Solr
>          Issue Type: Improvement
>          Components: search
>    Affects Versions: 1.4
>         Environment: Tomcat 6, JRE 6, Solr 1.3+ nightlies 
>            Reporter: Kay Kay
>            Priority: Minor
>             Fix For: 1.4
>
>         Attachments: NLProfile.java, SOLR-912.patch, SOLR-912.patch
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> The implementation of NamedList - while being fast - is not necessarily type-safe. I have implemented an additional implementation of the same - ModernNamedList (a type-safe variation providing the same interface as NamedList) - while preserving the semantics in terms of ordering of elements and allowing null elements for key and values (keys are always Strings , while values correspond to generics ). 



[jira] Commented: (SOLR-912) org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList

Posted by "Kay Kay (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12657630#action_12657630 ] 

Kay Kay commented on SOLR-912:
------------------------------

Additional Info: JRE 6, Linux 2.6.27-9, 3.2 GB memory, dual-core Intel @ 2.53 GHz.



[jira] Reopened: (SOLR-912) org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList

Posted by "Hoss Man (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hoss Man reopened SOLR-912:
---------------------------


reopening for the future

> org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList
> ----------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-912
>                 URL: https://issues.apache.org/jira/browse/SOLR-912
>             Project: Solr
>          Issue Type: Improvement
>          Components: search
>         Environment: Tomcat 6, JRE 6, Solr 1.3+ nightlies 
>            Reporter: Kay Kay
>            Priority: Minor
>         Attachments: NLProfile.java, SOLR-912.patch, SOLR-912.patch
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> The implementation of NamedList - while being fast - is not necessarily type-safe. I have implemented an additional implementation of the same - ModernNamedList (a type-safe variation providing the same interface as NamedList) - while preserving the semantics in terms of ordering of elements and allowing null elements for key and values (keys are always Strings , while values correspond to generics ). 



[jira] Commented: (SOLR-912) org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList

Posted by "Noble Paul (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12656399#action_12656399 ] 

Noble Paul commented on SOLR-912:
---------------------------------

bq. The interface is present only to enable the migration from NamedList (legacy) to the new one. (with similar properties of Cloneable, Serializable etc. ).
This means we would need to use this interface wherever we use NamedList, which is not desirable.

NamedList is designed to achieve a specific purpose.

Solr is not meant for Java users only. It is also meant to be consumed over XML, JSON, etc., and there is no type safety in those formats. NamedList gives us a data structure that can easily be converted back and forth between them.

bq. If type-safety is not a concern - is there a reason why NamedList<T> is defined as a generic type
It helps where it makes sense. Where it is not necessary, I can omit the type parameter entirely and javac does not complain. So it is better to keep it generic than not.

bq. There seems to be no memory leaks as far as the container is concerned

There are no leaks. I was referring to the internal implementation: instead of one big array, you keep a list of objects (which means more objects, one per entry).

bq. Creating an iterator object for every call to an iterator

Iterating over a NamedList does not require an iterator. That choice is left to the consumer of the API.
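
To make the two layouts under discussion concrete, here is a minimal sketch. It is not Solr's NamedList source; the class names LayoutSketch, PairwiseList and EntryList are hypothetical, and the accessors are only modelled on the getName/getVal style of index access. PairwiseList keeps names and values alternating in one backing list, while EntryList allocates one Map.Entry object per pair.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of the two storage layouts; not Solr code. */
final class LayoutSketch {

    /** Flat layout: names and values alternate in a single backing list. */
    static final class PairwiseList<T> {
        private final List<Object> nvPairs = new ArrayList<Object>();

        void add(String name, T val) {
            nvPairs.add(name);
            nvPairs.add(val);
        }

        int size() {
            return nvPairs.size() >> 1;
        }

        String getName(int idx) {
            return (String) nvPairs.get(idx << 1);
        }

        @SuppressWarnings("unchecked")
        T getVal(int idx) {
            return (T) nvPairs.get((idx << 1) + 1);
        }
    }

    /** Entry layout: one additional Map.Entry object per name/value pair. */
    static final class EntryList<T> {
        private final List<Map.Entry<String, T>> entries =
                new ArrayList<Map.Entry<String, T>>();

        void add(String name, T val) {
            entries.add(new AbstractMap.SimpleEntry<String, T>(name, val));
        }

        int size() {
            return entries.size();
        }

        String getName(int idx) {
            return entries.get(idx).getKey();
        }

        T getVal(int idx) {
            return entries.get(idx).getValue();
        }
    }

    /** Walks a PairwiseList by index, so no Iterator object is allocated. */
    static String join(PairwiseList<?> list) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < list.size(); i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(list.getName(i)).append('=').append(list.getVal(i));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        PairwiseList<Object> flat = new PairwiseList<Object>();
        flat.add("q", "solr");
        flat.add(null, 10);   // null names are allowed
        flat.add("q", null);  // so are repeated names and null values
        System.out.println(join(flat)); // q=solr,null=10,q=null
    }
}
```

Whether the per-entry objects matter in practice would need measuring; the sketch only shows where the extra allocations come from.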



Re: [jira] Commented: (SOLR-912) org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList

Posted by Kay Kay <ka...@gmail.com>.
On Fri, Dec 19, 2008 at 8:35 PM, Yonik Seeley <ys...@gmail.com> wrote:

> On Fri, Dec 19, 2008 at 7:10 PM, Mike Klaas <mi...@gmail.com> wrote:
> >
> > On 19-Dec-08, at 8:27 AM, Kay Kay (JIRA) wrote:
> >>            int newCapacity = (oldCapacity * 3)/2 + 1;
> >>
> >> +1 seems to be move away from 0, and keep incrementing the count. ( Hmm ..
> >> That piece of code - in Java 6 ArrayList can definitely make use of bitwise
> >> operators for the div-by-2 operation !!).
> >
> > Let's not go crazy here guys.  This relatively trivial calculation is only
> > called log(n) times, and certainly uses bit ops after the jit gets its hands
> > on it.
>
> Log(n) point is well taken.
>
> The translation from "/2" to ">>1" can only really take place when the
> compiler knows it's unsigned.  This might be a simple enough case for
> the compiler to figure it out, but who knows... it took forever (late
> Java6) to generate native rotate instructions.
>
> Oh, and Kay, the simplest way to strength-reduce x*3/2 is (x+x/2) or
> (x+(x>>1))


I agree.  If we are doing numeric calculations with predetermined constants,
we might as well help the compiler generate the most efficient code rather
than favor human readability. (The readable form can always go in the
comments.)



>
> -Yonik
>

Re: [jira] Commented: (SOLR-912) org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList

Posted by Yonik Seeley <ys...@gmail.com>.
On Fri, Dec 19, 2008 at 7:10 PM, Mike Klaas <mi...@gmail.com> wrote:
>
> On 19-Dec-08, at 8:27 AM, Kay Kay (JIRA) wrote:
>>            int newCapacity = (oldCapacity * 3)/2 + 1;
>>
>> +1 seems to be move away from 0, and keep incrementing the count. ( Hmm ..
>> That piece of code - in Java 6 ArrayList can definitely make use of bitwise
>> operators for the div-by-2 operation !!).
>
> Let's not go crazy here guys.  This relatively trivial calculation is only
> called log(n) times, and certainly uses bit ops after the jit gets its hands
> on it.

Log(n) point is well taken.

The translation from "/2" to ">>1" can only really take place when the
compiler knows it's unsigned.  This might be a simple enough case for
the compiler to figure it out, but who knows... it took forever (late
Java6) to generate native rotate instructions.

Oh, and Kay, the simplest way to strength-reduce x*3/2 is (x+x/2) or (x+(x>>1))
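
As an illustration of why signedness matters for this rewrite, here is a small sketch (the class and method names are made up for the example). In Java, integer division truncates toward zero while >> is an arithmetic shift that rounds toward negative infinity, so the two only agree for non-negative values. For non-negative x where x*3 does not overflow, all three forms of the 1.5x calculation give the same result.

```java
/** Hypothetical demo of the three ways to compute roughly 1.5 * x. */
final class StrengthReduction {

    static int mulDiv(int x) {
        return x * 3 / 2;
    }

    static int addDiv(int x) {
        return x + x / 2;
    }

    static int addShift(int x) {
        return x + (x >> 1);
    }

    public static void main(String[] args) {
        // Division truncates toward zero; the shift rounds toward negative infinity.
        System.out.println(-3 / 2);   // -1
        System.out.println(-3 >> 1);  // -2

        // For non-negative inputs in this range the three forms agree.
        for (int x = 0; x <= 1000000; x++) {
            if (mulDiv(x) != addDiv(x) || addDiv(x) != addShift(x)) {
                throw new AssertionError("mismatch at " + x);
            }
        }

        // For a negative input the shift form differs from the other two.
        System.out.println(mulDiv(-3) + " " + addDiv(-3) + " " + addShift(-3)); // -4 -4 -5
    }
}
```

A capacity is never negative, which is why the rewrite is safe in the resize case even if the compiler cannot prove it.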

-Yonik

Re: [jira] Commented: (SOLR-912) org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList

Posted by Mike Klaas <mi...@gmail.com>.
On 19-Dec-08, at 8:27 AM, Kay Kay (JIRA) wrote:
>
> Meanwhile - w.r.t resize() - ( trade-off because increasing size a  
> lot would increase memory usage.  increase a size by a smaller  
> factor would be resulting in a more frequent increases in size). I  
> believe reading some theory that the ideal increase factor is  
> somewhere close to  ( 1 + 2^0.5) / 2  or something similar to that.

It should be benchmarked, but yes, a factor of two is typically more  
memory wasteful than the performance it gains (you have a 50% chance  
of wasting at least 1/4 of your memory, a 25% chance of wasting at  
least 3/8th, etc.)
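
To make the 50% and 25% figures concrete, here is a small simulation sketch. It assumes a list that starts at capacity 1 and grows on demand, and that the final element count is equally likely to be anywhere in (1024, 2048]; those assumptions and the names GrowthWaste, capacityFor and shareWasting are mine, not from the thread. The 1.5x policy is included for comparison, using the ArrayList-style formula quoted below.

```java
/** Hypothetical simulation of unused capacity under two growth policies. */
final class GrowthWaste {

    /** Capacity reached by growing from 1 until n elements fit. */
    static int capacityFor(int n, boolean doubling) {
        int cap = 1;
        while (cap < n) {
            // doubling: cap * 2; otherwise roughly 1.5x: cap + cap/2 + 1
            cap = doubling ? cap * 2 : cap + (cap >> 1) + 1;
        }
        return cap;
    }

    /** Share of sizes in (lo, hi] whose unused slots are at least num/den of capacity. */
    static double shareWasting(int lo, int hi, int num, int den, boolean doubling) {
        int hits = 0;
        for (int n = lo + 1; n <= hi; n++) {
            int cap = capacityFor(n, doubling);
            if ((long) (cap - n) * den >= (long) cap * num) {
                hits++;
            }
        }
        return (double) hits / (hi - lo);
    }

    public static void main(String[] args) {
        // Doubling: every size in (1024, 2048] ends up with capacity 2048.
        System.out.println(shareWasting(1024, 2048, 1, 4, true)); // 0.5
        System.out.println(shareWasting(1024, 2048, 3, 8, true)); // 0.25

        // The same questions for the 1.5x policy, for comparison.
        System.out.println(shareWasting(1024, 2048, 1, 4, false));
        System.out.println(shareWasting(1024, 2048, 3, 8, false));
    }
}
```

This only measures slack after the last resize; it says nothing about the cost of the extra copies a smaller factor causes, which is the part that would need benchmarking.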

> The method - ensureCapacity(capacity) in ArrayList (Java 6) also  
> seems to be a number along the lines ~ (1.5)
>
> 	    int newCapacity = (oldCapacity * 3)/2 + 1;
>
> +1 seems to be move away from 0, and keep incrementing the count.  
> ( Hmm .. That piece of code - in Java 6 ArrayList can definitely  
> make use of bitwise operators for the div-by-2 operation !!).

Let's not go crazy here guys.  This relatively trivial calculation is  
only called log(n) times, and certainly uses bit ops after the jit  
gets its hands on it.

-Mike

[jira] Commented: (SOLR-912) org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList

Posted by "Kay Kay (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12658114#action_12658114 ] 

Kay Kay commented on SOLR-912:
------------------------------

System.arraycopy is great. It is bound to perform much better because it is implemented in native code.

Meanwhile, w.r.t. resize(), there is a trade-off: increasing the size by a large factor increases memory usage, while increasing it by a smaller factor results in more frequent resizes. I recall reading some theory that the ideal growth factor is somewhere close to (1 + 2^0.5) / 2, or something similar to that.


The ensureCapacity(capacity) method in ArrayList (Java 6) also seems to use a factor of about 1.5:

	    int newCapacity = (oldCapacity * 3)/2 + 1; 

The +1 seems to be there to move away from 0 and keep the capacity growing. (Hmm... that piece of code in the Java 6 ArrayList could definitely use bitwise operators for the div-by-2 operation!)
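
A minimal sketch of how that formula and System.arraycopy fit together in a growable array. GrowableArray is a hypothetical class written for this example, not the ArrayList source; only the newCapacity expression is taken from the line quoted above, and the initial capacity of 10 is an assumption.

```java
/** Hypothetical growable array using the quoted capacity formula. */
final class GrowableArray {
    private Object[] data = new Object[10];
    private int size;
    private int resizes;

    /** The growth rule quoted above; the +1 lets a capacity of 0 grow to 1. */
    static int newCapacity(int oldCapacity) {
        return (oldCapacity * 3) / 2 + 1;
    }

    void add(Object o) {
        if (size == data.length) {
            Object[] bigger = new Object[newCapacity(data.length)];
            // Bulk copy of the existing elements into the larger array.
            System.arraycopy(data, 0, bigger, 0, size);
            data = bigger;
            resizes++;
        }
        data[size++] = o;
    }

    Object get(int idx) {
        if (idx < 0 || idx >= size) {
            throw new IndexOutOfBoundsException("index " + idx + ", size " + size);
        }
        return data[idx];
    }

    int size() {
        return size;
    }

    int capacity() {
        return data.length;
    }

    int resizes() {
        return resizes;
    }

    public static void main(String[] args) {
        System.out.println(newCapacity(0));  // 1
        System.out.println(newCapacity(10)); // 16

        GrowableArray a = new GrowableArray();
        for (int i = 0; i < 1000; i++) {
            a.add(i);
        }
        System.out.println(a.size() + " elements, capacity " + a.capacity()
                + ", resizes " + a.resizes());
    }
}
```

The resize count grows logarithmically with the number of elements, which is the point made elsewhere in the thread about how rarely this calculation runs.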





[jira] Commented: (SOLR-912) org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList

Posted by "Noble Paul (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12785818#action_12785818 ] 

Noble Paul commented on SOLR-912:
---------------------------------

Type safety is generally not required in NamedList; very often we use heterogeneous NamedLists.

Creating an Entry object per entry is memory-inefficient compared to the existing implementation.

Type safety is already there for users of the NamedList API. How we manage it internally is not so important, I feel.





[jira] Commented: (SOLR-912) org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList

Posted by "Kay Kay (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786184#action_12786184 ] 

Kay Kay commented on SOLR-912:
------------------------------

{quote}
Type safety is generally not required in NamedList. very often we use heterogeneous NamedList.

Creating an Entry Object per entry is memory inefficient compared to the existing one.

type safety is there for the users of NamedList API even now. Internally how we manage it is not so important i feel
{quote}

The performance numbers here tell a different story. The heterogeneous NamedList data structure is not intuitive w.r.t. code, and it performs poorly compared to the revised one proposed here. As for Entry objects being a memory hog: do we have stats to back that up? Otherwise it is premature to call it a memory optimization.



[jira] Commented: (SOLR-912) org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList

Posted by "Kay Kay (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12658251#action_12658251 ] 

Kay Kay commented on SOLR-912:
------------------------------

|  We also need to make sure we don't eliminate any public constructors, which seems to be the case based on my quick glance at the latest patch. 


<code>
-  public NamedList(List nameValuePairs) {
-    nvPairs=nameValuePairs;
+  protected NamedList(List<Map.Entry<String, T>> nameValuePairs) {
+    nvPairs = nameValuePairs;
   }
</code>

The previous code had a ctor taking a heterogeneous List. To ensure type safety, I changed the access level of that List ctor and added another public ctor, taking Object[] and marked deprecated, so that people are still able to use the functionality.

Otherwise, retaining the same signature after adding type safety would mean people passing in a List of Strings and Ts when the ctor expects a List of Map.Entry<String, T>, which would cause more confusion.

Because of generics erasure, List and List<Map.Entry<String, T>> are the same type at runtime, which does not help here.
If backward compatibility is the key here, I can revisit the patch to ensure it.
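
A small sketch of the erasure problem described above. ErasureSketch is a hypothetical stand-in, not the patched NamedList: it shows that a typed List ctor and a second List ctor cannot coexist as overloads, and how a deprecated Object[] ctor can carry the old pairwise form instead.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a type-safe list with a legacy pairwise ctor. */
final class ErasureSketch<T> {
    private final List<Map.Entry<String, T>> nvPairs;

    ErasureSketch(List<Map.Entry<String, T>> nameValuePairs) {
        nvPairs = nameValuePairs;
    }

    // Adding the overload below would not compile: after erasure both ctors
    // have the signature ErasureSketch(List), which javac rejects as a name clash.
    // ErasureSketch(List<Object> nameValuePairs) { ... }

    /** Legacy pairwise form: names and values alternate in the array. */
    @Deprecated
    @SuppressWarnings("unchecked")
    ErasureSketch(Object[] nameValuePairs) {
        nvPairs = new ArrayList<Map.Entry<String, T>>();
        for (int i = 0; i + 1 < nameValuePairs.length; i += 2) {
            nvPairs.add(new AbstractMap.SimpleEntry<String, T>(
                    (String) nameValuePairs[i], (T) nameValuePairs[i + 1]));
        }
    }

    int size() {
        return nvPairs.size();
    }

    String getName(int idx) {
        return nvPairs.get(idx).getKey();
    }

    T getVal(int idx) {
        return nvPairs.get(idx).getValue();
    }

    /** Differently parameterized lists share one runtime class. */
    static boolean sameRuntimeClass() {
        List<String> names = new ArrayList<String>();
        List<Map.Entry<String, Integer>> entries =
                new ArrayList<Map.Entry<String, Integer>>();
        return names.getClass() == entries.getClass();
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass()); // true

        ErasureSketch<Integer> legacy =
                new ErasureSketch<Integer>(new Object[] {"rows", 10, "start", 0});
        System.out.println(legacy.getName(1) + "=" + legacy.getVal(1)); // start=0
    }
}
```

Note that the casts in the Object[] ctor are unchecked, so a wrongly typed value only fails later, when the caller reads it back as T.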


| If there are performance gains to be had in the common case i'm all for it ... but i still feel like i'm not understanding the original goal: how does this approach give us more type safety? 

When I logged the issue, type safety was the major reason behind it. When I submitted my first patch and did the benchmarking, performance also turned out to be a major concern (incremental addition and the creation of iterator objects). NamedList seems to be used all over the place. As long as we preserve the contract of the methods, this should definitely give an additional boost, as I discovered while profiling the launch of SolrCore (CoreContainer.Initializer.initialize() ...).






[jira] Commented: (SOLR-912) org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList

Posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786224#action_12786224 ] 

Chris A. Mattmann commented on SOLR-912:
----------------------------------------

bq. The heterogenous NamedList data structure is not intuitive w.r.t code and performs poorly compared to the revised one as put in here.

+1 to this. NamedList is not a very intuitive structure at all, as I remarked on SOLR-1516.

Cheers,
Chris




[jira] Commented: (SOLR-912) org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList

Posted by "Hoss Man (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12664760#action_12664760 ] 

Hoss Man commented on SOLR-912:
-------------------------------

Kay: It occurs to me that one thing we could do is invert the nature of your patch:
# add an "Entry<String, T>[]" constructor (with a warning, but no commitment, that modifying the array contents _may_ affect the NamedList) which builds up a pairwise List<Object> and delegates to the existing "List" constructor
# deprecate the existing List constructor.

Anyone using the new constructor will get the type safety benefits (not entirely enforced by the compiler, but enforced by the contract), and at some later date (Solr 2?) we can remove the "List" constructor and replace the guts of the class with your approach (to get the perf improvements).
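
The inverted approach could look roughly like the sketch below. InvertedSketch is a hypothetical class, not a proposed patch; it assumes the internal storage stays a pairwise List<Object> and that the new ctor copies the entries, which is one of the behaviours the "may affect" wording leaves open.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: typed Entry[] ctor delegating to a legacy List ctor. */
final class InvertedSketch<T> {
    // Pairwise storage: name, value, name, value, ...
    private final List<Object> nvPairs;

    /** Existing-style ctor, kept for compatibility but deprecated. */
    @Deprecated
    InvertedSketch(List<Object> nameValuePairs) {
        nvPairs = nameValuePairs;
    }

    /** New typed ctor: flattens the entries and delegates to the List ctor. */
    InvertedSketch(Map.Entry<String, T>[] entries) {
        this(flatten(entries));
    }

    private static List<Object> flatten(Map.Entry<String, ?>[] entries) {
        List<Object> flat = new ArrayList<Object>(entries.length * 2);
        for (Map.Entry<String, ?> e : entries) {
            flat.add(e.getKey());
            flat.add(e.getValue());
        }
        return flat;
    }

    int size() {
        return nvPairs.size() >> 1;
    }

    String getName(int idx) {
        return (String) nvPairs.get(idx << 1);
    }

    @SuppressWarnings("unchecked")
    T getVal(int idx) {
        return (T) nvPairs.get((idx << 1) + 1);
    }

    public static void main(String[] args) {
        // Generic arrays cannot be created directly, hence the unchecked assignment.
        @SuppressWarnings("unchecked")
        Map.Entry<String, Integer>[] entries = new Map.Entry[] {
            new AbstractMap.SimpleEntry<String, Integer>("rows", 10),
            new AbstractMap.SimpleEntry<String, Integer>("start", 0)
        };
        InvertedSketch<Integer> nl = new InvertedSketch<Integer>(entries);
        System.out.println(nl.getName(0) + "=" + nl.getVal(0)); // rows=10
    }
}
```

Callers of the typed ctor get compile-time checking on what they pass in, while the pairwise internals stay untouched until the deprecated ctor can be removed.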

thoughts?



[jira] Closed: (SOLR-912) org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList

Posted by "Kay Kay (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kay Kay closed SOLR-912.
------------------------


> org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList
> ----------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-912
>                 URL: https://issues.apache.org/jira/browse/SOLR-912
>             Project: Solr
>          Issue Type: Improvement
>          Components: search
>    Affects Versions: 1.4
>         Environment: Tomcat 6, JRE 6, Solr 1.3+ nightlies 
>            Reporter: Kay Kay
>            Priority: Minor
>             Fix For: 1.4
>
>         Attachments: NLProfile.java, SOLR-912.patch, SOLR-912.patch
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> The implementation of NamedList - while being fast - is not necessarily type-safe. I have implemented an additional implementation of the same - ModernNamedList (a type-safe variation providing the same interface as NamedList) - while preserving the semantics in terms of ordering of elements and allowing null elements for key and values (keys are always Strings , while values correspond to generics ). 


[jira] Commented: (SOLR-912) org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList

Posted by "Kay Kay (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12785810#action_12785810 ] 

Kay Kay commented on SOLR-912:
------------------------------

So, what is the current status of this?

When are we planning to deprecate the old implementation and incorporate this new data structure (changing the guts of the internals)?

> org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList
> ----------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-912
>                 URL: https://issues.apache.org/jira/browse/SOLR-912
>             Project: Solr
>          Issue Type: Improvement
>          Components: search
>         Environment: Tomcat 6, JRE 6, Solr 1.3+ nightlies 
>            Reporter: Kay Kay
>            Priority: Minor
>         Attachments: NLProfile.java, SOLR-912.patch, SOLR-912.patch
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> The implementation of NamedList - while being fast - is not necessarily type-safe. I have implemented an additional implementation of the same - ModernNamedList (a type-safe variation providing the same interface as NamedList) - while preserving the semantics in terms of ordering of elements and allowing null elements for key and values (keys are always Strings , while values correspond to generics ). 


[jira] Commented: (SOLR-912) org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList

Posted by "Kay Kay (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12656408#action_12656408 ] 

Kay Kay commented on SOLR-912:
------------------------------

The interface is mostly a proof of concept showing that the new class, ModernNamedList, implements the same methods as NamedList. Eventually the interface would be removed, once we are happy with the new class.

As for NamedList: if we want the flexibility of allowing any type in it, we might as well declare the value type as Object. If we do qualify it with a specific type, then we might as well enforce type safety in the class. javac does not complain today because the compiler switch that reports type-safety errors has been turned off.

The previous implementation uses a pre-JDK 5 (raw) List whose members are a String or a value, depending on the index. The revised implementation uses the Map.Entry<String, T> interface, which maps directly onto what is required (a Map that preserves order and allows duplicates and nulls). I profiled 2 different implementations, one using Map.Entry<?> and one using a heterogeneous list of String and a type, with insertion / deletion of 100,000 records. The current implementation in fact lost the performance comparison for both insertion and deletion in the middle of the List (remove()), since it has to add/remove elements twice from the List, as compared to 1 insertion/deletion in the Map.Entry<> implementation. Given that addition/deletion in the List is worst-case linear, I believe the perceived performance degradation due to the additional object turns out to be not so bad when compared to the 2-step insertion / deletion we have today.
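
A minimal sketch of the double-removal point, assuming a flat [name, value, name, value, ...] layout on one side and a list of Map.Entry on the other. LayoutSketch, removeFlat and removeEntry are hypothetical names for illustration; they are not taken from the patch or from Solr.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for the two layouts; not the classes from the patch.
final class LayoutSketch {
    // Flat layout: names and values alternate in one untyped list.
    static Object removeFlat(List<Object> flat, int pairIndex) {
        int i = pairIndex * 2;
        flat.remove(i);         // removes the name; the value shifts down to slot i
        return flat.remove(i);  // second removal takes out the value
    }

    // Entry layout: one Map.Entry per pair, so a single removal suffices.
    static <T> T removeEntry(List<Map.Entry<String, T>> entries, int pairIndex) {
        return entries.remove(pairIndex).getValue();
    }

    public static void main(String[] args) {
        List<Object> flat = new ArrayList<>(Arrays.<Object>asList("a", 1, "b", 2, "c", 3));
        System.out.println(removeFlat(flat, 1) + " removed, left: " + flat);

        List<Map.Entry<String, Integer>> entries = new ArrayList<>();
        entries.add(new SimpleEntry<>("a", 1));
        entries.add(new SimpleEntry<>("b", 2));
        entries.add(new SimpleEntry<>("c", 3));
        System.out.println(removeEntry(entries, 1) + " removed, left: " + entries);
    }
}
```

Each ArrayList.remove(int) shifts the tail of the backing array, so the flat layout pays for two shifts per removed pair while the entry layout pays for one.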

NamedList does seem to implement the interface Iterable<T>. I am not sure how the consumer of the API can have independent iterators (since only NamedList is supposed to be aware of the internal data structures and not the consumer). So I believe it would be up to NamedList<T> to provide an iterator to the user of the API.
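
One way a list class can hand out iterators without exposing its internals is sketched below. EntryBackedList and its methods are hypothetical names used only for illustration, and the choice of an unmodifiable view is an assumption of this sketch, not something stated in the thread.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical entry-backed list; the name and methods are illustrative only.
final class EntryBackedList<T> implements Iterable<Map.Entry<String, T>> {
    private final List<Map.Entry<String, T>> entries = new ArrayList<>();

    void add(String name, T value) {
        // SimpleImmutableEntry permits null names and null values.
        entries.add(new SimpleImmutableEntry<>(name, value));
    }

    int size() {
        return entries.size();
    }

    @Override
    public Iterator<Map.Entry<String, T>> iterator() {
        // Delegate to the internal list; the unmodifiable view disables remove().
        return Collections.unmodifiableList(entries).iterator();
    }

    public static void main(String[] args) {
        EntryBackedList<Integer> nl = new EntryBackedList<>();
        nl.add("a", 1);
        nl.add(null, null);  // null name and value are kept
        nl.add("a", 2);      // duplicate names are kept, in insertion order
        for (Map.Entry<String, Integer> e : nl) {
            System.out.println(e.getKey() + "=" + e.getValue());
        }
    }
}
```

The caller only ever sees Map.Entry<String, T> values in insertion order; the backing list stays private to the class.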

> org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList
> ----------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-912
>                 URL: https://issues.apache.org/jira/browse/SOLR-912
>             Project: Solr
>          Issue Type: Improvement
>          Components: clients - java
>    Affects Versions: 1.4
>         Environment: Tomcat 6, JRE 6, Solr 1.3+ nightlies 
>            Reporter: Kay Kay
>            Priority: Minor
>             Fix For: 1.3.1
>
>         Attachments: SOLR-912.patch
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> The implementation of NamedList - while being fast - is not necessarily type-safe. I have implemented an additional implementation of the same - ModernNamedList (a type-safe variation providing the same interface as NamedList) - while preserving the semantics in terms of ordering of elements and allowing null elements for key and values (keys are always Strings , while values correspond to generics ). 


[jira] Updated: (SOLR-912) org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList

Posted by "Hoss Man (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hoss Man updated SOLR-912:
--------------------------

    Affects Version/s:     (was: 1.4)
        Fix Version/s:     (was: 1.4)

> org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList
> ----------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-912
>                 URL: https://issues.apache.org/jira/browse/SOLR-912
>             Project: Solr
>          Issue Type: Improvement
>          Components: search
>         Environment: Tomcat 6, JRE 6, Solr 1.3+ nightlies 
>            Reporter: Kay Kay
>            Priority: Minor
>         Attachments: NLProfile.java, SOLR-912.patch, SOLR-912.patch
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> The implementation of NamedList - while being fast - is not necessarily type-safe. I have implemented an additional implementation of the same - ModernNamedList (a type-safe variation providing the same interface as NamedList) - while preserving the semantics in terms of ordering of elements and allowing null elements for key and values (keys are always Strings , while values correspond to generics ). 


[jira] Commented: (SOLR-912) org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList

Posted by "Noble Paul (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786348#action_12786348 ] 

Noble Paul commented on SOLR-912:
---------------------------------

bq.The performance numbers in here say a different story

I'm not referring to perf numbers here. It is about memory efficiency.

bq. As regarding Entry Objects being a memory hog - do we have some stats to back it up. 

We don't need stats for everything. We should know how the VM holds objects.

Let me illustrate with a case of 5 key->value pairs on a 32-bit machine.

NamedList (backed by ArrayList)
one Object[] + array size = 4 + 5*2*4 (bytes) = 44 bytes + the overhead of ArrayList

ModernNamedList

one Object[] + 5 entry objects (each has 2 references of 4+4 bytes) + array size = 4 + 5*2*4 + 5*4 = 64 bytes + the overhead of ArrayList

Add to this the overhead of GC'ing 5 entry objects.

> org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList
> ----------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-912
>                 URL: https://issues.apache.org/jira/browse/SOLR-912
>             Project: Solr
>          Issue Type: Improvement
>          Components: search
>         Environment: Tomcat 6, JRE 6, Solr 1.3+ nightlies 
>            Reporter: Kay Kay
>            Priority: Minor
>         Attachments: NLProfile.java, SOLR-912.patch, SOLR-912.patch
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> The implementation of NamedList - while being fast - is not necessarily type-safe. I have implemented an additional implementation of the same - ModernNamedList (a type-safe variation providing the same interface as NamedList) - while preserving the semantics in terms of ordering of elements and allowing null elements for key and values (keys are always Strings , while values correspond to generics ). 


[jira] Commented: (SOLR-912) org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList

Posted by "Kay Kay (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12657627#action_12657627 ] 

Kay Kay commented on SOLR-912:
------------------------------

| ModernNamedList is being suggested as an alternate implementation of NamedList ... ideally the internals of NamedList would be replaced with the internals of ModernNamedList, but in this patch they are separate classes so they can be compared.

| INamedList is included in the patch as a way to demonstrate that ModernNamedList fulfills the same contract as NamedList (for the purposes of testing etc)

True. 

Attached is NLProfile.java, which contains a sample benchmark of the 2 implementations (it works with the previous patch on this page).

Some results:

addAll / getAll(): the increase in performance is in the 1-10% range.

add: performance increases by around 30%, probably because of the additional growth of the List implementation when its size approaches capacity. Since NamedList inserts 2 elements where ModernNamedList inserts one, the effect might be more pronounced.


iterator: ~70% increase in performance in favor of the new implementation, since it just reuses the iterator of the internal data structure.

The numbers should be present as comments in the corresponding methods: testAdd(), testAddAll(), testGetAll(), testIterator().

I will attach the final patch once we are convinced by the benchmark methodology and the numbers.
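
For readers without the attachment, the shape of such an add() comparison might look like the sketch below. This is not NLProfile.java; AddTimingSketch and its methods are hypothetical, and the flat and entry lists are stand-ins for the two implementations rather than the real classes.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical micro-benchmark; this is not the attached NLProfile.java.
final class AddTimingSketch {
    // Flat layout: two list insertions per name/value pair.
    static List<Object> fillFlat(int n) {
        List<Object> flat = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            flat.add("k" + i);
            flat.add(i);
        }
        return flat;
    }

    // Entry layout: one list insertion (plus one Entry allocation) per pair.
    static List<Map.Entry<String, Integer>> fillEntries(int n) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            entries.add(new SimpleEntry<>("k" + i, i));
        }
        return entries;
    }

    public static void main(String[] args) {
        int n = 100_000;
        // Several rounds, so the early ones double as JIT warm-up.
        for (int round = 0; round < 5; round++) {
            long t0 = System.nanoTime();
            int flatSlots = fillFlat(n).size();
            long t1 = System.nanoTime();
            int entrySlots = fillEntries(n).size();
            long t2 = System.nanoTime();
            System.out.printf("round %d: flat %d us (%d slots), entries %d us (%d slots)%n",
                    round, (t1 - t0) / 1000, flatSlots, (t2 - t1) / 1000, entrySlots);
        }
    }
}
```

Wall-clock numbers from a loop like this vary with JIT warm-up and garbage collection, so they are only a rough indication and say nothing about memory footprint.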


> org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList
> ----------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-912
>                 URL: https://issues.apache.org/jira/browse/SOLR-912
>             Project: Solr
>          Issue Type: Improvement
>          Components: Analysis
>    Affects Versions: 1.4
>         Environment: Tomcat 6, JRE 6, Solr 1.3+ nightlies 
>            Reporter: Kay Kay
>            Priority: Minor
>             Fix For: 1.3.1
>
>         Attachments: NLProfile.java, SOLR-912.patch
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> The implementation of NamedList - while being fast - is not necessarily type-safe. I have implemented an additional implementation of the same - ModernNamedList (a type-safe variation providing the same interface as NamedList) - while preserving the semantics in terms of ordering of elements and allowing null elements for key and values (keys are always Strings , while values correspond to generics ). 


[jira] Commented: (SOLR-912) org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList

Posted by "Hoss Man (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12658249#action_12658249 ] 

Hoss Man commented on SOLR-912:
-------------------------------

If there are performance gains to be had in the common case i'm all for it ... but i still feel like i'm not understanding the original goal: how does this approach give us more type safety?

We also need to make sure we don't eliminate any public constructors, which the latest patch seems to do, based on my quick glance at it.

> org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList
> ----------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-912
>                 URL: https://issues.apache.org/jira/browse/SOLR-912
>             Project: Solr
>          Issue Type: Improvement
>          Components: search
>    Affects Versions: 1.4
>         Environment: Tomcat 6, JRE 6, Solr 1.3+ nightlies 
>            Reporter: Kay Kay
>            Priority: Minor
>             Fix For: 1.4
>
>         Attachments: NLProfile.java, SOLR-912.patch, SOLR-912.patch
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> The implementation of NamedList - while being fast - is not necessarily type-safe. I have implemented an additional implementation of the same - ModernNamedList (a type-safe variation providing the same interface as NamedList) - while preserving the semantics in terms of ordering of elements and allowing null elements for key and values (keys are always Strings , while values correspond to generics ). 


[jira] Updated: (SOLR-912) org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList

Posted by "Shalin Shekhar Mangar (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shalin Shekhar Mangar updated SOLR-912:
---------------------------------------

      Component/s:     (was: Analysis)
                   search
    Fix Version/s:     (was: 1.3.1)
                   1.4

> org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList
> ----------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-912
>                 URL: https://issues.apache.org/jira/browse/SOLR-912
>             Project: Solr
>          Issue Type: Improvement
>          Components: search
>    Affects Versions: 1.4
>         Environment: Tomcat 6, JRE 6, Solr 1.3+ nightlies 
>            Reporter: Kay Kay
>            Priority: Minor
>             Fix For: 1.4
>
>         Attachments: NLProfile.java, SOLR-912.patch
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> The implementation of NamedList - while being fast - is not necessarily type-safe. I have implemented an additional implementation of the same - ModernNamedList (a type-safe variation providing the same interface as NamedList) - while preserving the semantics in terms of ordering of elements and allowing null elements for key and values (keys are always Strings , while values correspond to generics ). 


[jira] Commented: (SOLR-912) org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList

Posted by "Hoss Man (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12657299#action_12657299 ] 

Hoss Man commented on SOLR-912:
-------------------------------

If i'm understanding the discussion so far...

* ModernNamedList is being suggested as an alternate implementation of NamedList ... ideally the internals of NamedList would be replaced with the internals of ModernNamedList, but in this patch they are separate classes so they can be compared.
* INamedList is included in the patch as a way to demonstrate that ModernNamedList fulfills the same contract as NamedList (for the purposes of testing etc)

do i have those aspects correct?

with that in mind: i'm not sure i understand what "itch" changing the implementation "scratches" ... the initial issue description says it's because NamedList " is not necessarily type-safe" but it's not clear what that statement is referring to ... later comments suggest that the motivation is to improve the performance of "remove" ... which hardly seems like something worth optimizing for.

I agree that having the internals based on a "list of pairs" certainly seems like it might be more intuitive to developers looking at the internals (than the current approach is), but how is the current approach less type safe for consumers using just the NamedList API?

If the "modern" approach is more performant than the existing impl and passes all of the tests then i suppose it would make sense to switch -- but i'm far more interested in how the performance compares for common cases (add/get/iterate) than for cases that hardly ever come up (remove).

My suggestion: provide two independent attachments.  One patch that just replaces the internals of NamedList with the approach you suggest so people can apply the patch, test it out, and verify the API/behavior; a second attachment that provides some benchmarks against the NamedList class -- so people can read/run your benchmark with and without the patch to see how the performance changes.


> org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList
> ----------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-912
>                 URL: https://issues.apache.org/jira/browse/SOLR-912
>             Project: Solr
>          Issue Type: Improvement
>          Components: Analysis
>    Affects Versions: 1.4
>         Environment: Tomcat 6, JRE 6, Solr 1.3+ nightlies 
>            Reporter: Kay Kay
>            Priority: Minor
>             Fix For: 1.3.1
>
>         Attachments: SOLR-912.patch
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> The implementation of NamedList - while being fast - is not necessarily type-safe. I have implemented an additional implementation of the same - ModernNamedList (a type-safe variation providing the same interface as NamedList) - while preserving the semantics in terms of ordering of elements and allowing null elements for key and values (keys are always Strings , while values correspond to generics ). 


[jira] Commented: (SOLR-912) org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList

Posted by "Noble Paul (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12656381#action_12656381 ] 

Noble Paul commented on SOLR-912:
---------------------------------

Type safety is not an overriding concern when a NamedList is used (that is the beauty of it). It does not help in any way. Most usages of NamedList involve heterogeneous values.

Your implementation is not as memory-efficient as the original one.

The idea of having an interface is overkill.

-1

> org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList
> ----------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-912
>                 URL: https://issues.apache.org/jira/browse/SOLR-912
>             Project: Solr
>          Issue Type: Improvement
>          Components: clients - java
>    Affects Versions: 1.4
>         Environment: Tomcat 6, JRE 6, Solr 1.3+ nightlies 
>            Reporter: Kay Kay
>            Priority: Minor
>             Fix For: 1.3.1
>
>         Attachments: SOLR-912.patch
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> The implementation of NamedList - while being fast - is not necessarily type-safe. I have implemented an additional implementation of the same - ModernNamedList (a type-safe variation providing the same interface as NamedList) - while preserving the semantics in terms of ordering of elements and allowing null elements for key and values (keys are always Strings , while values correspond to generics ). 


[jira] Updated: (SOLR-912) org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList

Posted by "Kay Kay (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kay Kay updated SOLR-912:
-------------------------

    Attachment: SOLR-912.patch

1) New type-safe implementation of NamedList, ModernNamedList.

2) New interface INamedList<T> created.

3) New test case for ModernNamedList added.

4) Added more test cases to NamedListTest.

Once the patch is approved, NamedList would be deprecated and the existing codebase in Solr would be changed to use ModernNamedList<T> to be more type-safe.

> org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList
> ----------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-912
>                 URL: https://issues.apache.org/jira/browse/SOLR-912
>             Project: Solr
>          Issue Type: Improvement
>          Components: clients - java
>    Affects Versions: 1.4
>         Environment: Tomcat 6, JRE 6, Solr 1.3+ nightlies 
>            Reporter: Kay Kay
>            Priority: Minor
>             Fix For: 1.3.1
>
>         Attachments: SOLR-912.patch
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> The implementation of NamedList - while being fast - is not necessarily type-safe. I have implemented an additional implementation of the same - ModernNamedList (a type-safe variation providing the same interface as NamedList) - while preserving the semantics in terms of ordering of elements and allowing null elements for key and values (keys are always Strings , while values correspond to generics ). 


[jira] Commented: (SOLR-912) org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList

Posted by "Kay Kay (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12665163#action_12665163 ] 

Kay Kay commented on SOLR-912:
------------------------------

Logged a separate issue, SOLR-967, for the temporary fix and for tracking progress on it.  This issue changes the internals of the class, which will be revisited after adopting a proper migration path per SOLR-967.


> org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList
> ----------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-912
>                 URL: https://issues.apache.org/jira/browse/SOLR-912
>             Project: Solr
>          Issue Type: Improvement
>          Components: search
>    Affects Versions: 1.4
>         Environment: Tomcat 6, JRE 6, Solr 1.3+ nightlies 
>            Reporter: Kay Kay
>            Priority: Minor
>             Fix For: 1.4
>
>         Attachments: NLProfile.java, SOLR-912.patch, SOLR-912.patch
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> The implementation of NamedList - while being fast - is not necessarily type-safe. I have implemented an additional implementation of the same - ModernNamedList (a type-safe variation providing the same interface as NamedList) - while preserving the semantics in terms of ordering of elements and allowing null elements for key and values (keys are always Strings , while values correspond to generics ). 


[jira] Commented: (SOLR-912) org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList

Posted by "Hoss Man (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12663998#action_12663998 ] 

Hoss Man commented on SOLR-912:
-------------------------------

Ok, now I think I understand your type safety goal -- the existing implementation is type safe if-and-only-if the precondition of the "List" constructor is met (that it contains pairwise names/values) ... your goal is to make NamedList guarantee type correctness. (correct?)
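
The precondition can be made concrete with a small sketch. FlatNamedListSketch is a hypothetical stand-in that mimics a flat, list-backed named list; it is not Solr's class, and it takes List<Object> where the real constructor takes a raw List.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in that mimics a flat, list-backed named list.
final class FlatNamedListSketch<T> {
    private final List<Object> nvPairs;

    // Trusts the caller to pass [name, value, name, value, ...];
    // nothing is checked at construction time.
    FlatNamedListSketch(List<Object> nameValuePairs) {
        this.nvPairs = nameValuePairs;
    }

    String getName(int idx) {
        return (String) nvPairs.get(idx * 2);  // fails only when a bad slot is read
    }

    @SuppressWarnings("unchecked")
    T getVal(int idx) {
        return (T) nvPairs.get(idx * 2 + 1);
    }

    public static void main(String[] args) {
        // Precondition violated: slot 2 holds an Integer where a name belongs.
        List<Object> bad = new ArrayList<>(Arrays.<Object>asList("rows", 10, 42, "oops"));
        FlatNamedListSketch<Object> nl = new FlatNamedListSketch<>(bad);  // accepted silently
        System.out.println(nl.getName(0));
        try {
            nl.getName(1);
        } catch (ClassCastException e) {
            System.out.println("late failure: " + e.getClass().getSimpleName());
        }
    }
}
```

The malformed list is accepted by the constructor and only fails later, at the read site, which is the gap a list of typed entries would close at compile time.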

bq. If backward compatibility is the key here- I can revisit the patch again ensuring the same.

There are a lot of internal APIs where I wouldn't be opposed to fudging backwards compatibility in the interests of better code, but NamedList isn't one of them.  It's *the* data structure that gets used by almost any type of plugin people may write -- any request handler or search component that wants to add data to the response is going to be constructing a new NamedList, so I'm definitely not on board with breaking things for all of those people.

Unfortunately, I think your goal is in direct and inherent opposition to backwards compatibility.

As you mentioned, type erasure prevents us from adding a new "List<Entry<String, T>>" constructor while keeping the existing "List" constructor -- but that's not the biggest problem.  We could always use tricks (like adding an extra ignored arg to the new constructor, or making the new constructor take in an "Entry<String, T>[]" instead of a List) to maintain binary API compatibility, and then have the legacy constructor cast the List elements as needed to delegate to the new constructor .... *EXCEPT* .... binary API compatibility is only part of backwards compatibility.   The bigger problem is this sentence in the javadocs...

{noformat}
   * @param nameValuePairs underlying List which should be used to implement a NamedList; modifying this List will affect the NamedList.
   */
  public NamedList(List nameValuePairs) {
{noformat}

...it would be nice if that was just an implementation detail, and the javadocs said "... _may_ affect the NamedList", but it says "... _will_ affect the NamedList".  Changing the internals (and that constructor) would change the behavior out from under people who have an existing expectation that they can maintain a reference to the List and modify it to affect the NamedList.  Unfortunately this isn't an academic point: I've actually seen this utilized in plugin code. People build up data structures containing NamedLists, and then data is added to the underlying Lists backing those NamedLists after the fact (but before the NamedList is iterated by a response writer).
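
The aliasing behavior that callers rely on can be sketched as follows. AliasingSketch is a hypothetical stand-in that, like the documented constructor, keeps the caller's list rather than copying it; it is not Solr's class.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a list-backed named list; not Solr's class.
final class AliasingSketch {
    private final List<Object> nvPairs;

    AliasingSketch(List<Object> nameValuePairs) {
        this.nvPairs = nameValuePairs;  // keeps the caller's list, no copy
    }

    int size() {
        return nvPairs.size() / 2;
    }

    public static void main(String[] args) {
        List<Object> backing = new ArrayList<>();
        backing.add("numFound");
        backing.add(3);
        AliasingSketch nl = new AliasingSketch(backing);
        System.out.println(nl.size());  // 1

        // Code that kept the reference appends a pair after construction.
        backing.add("maxScore");
        backing.add(1.0f);
        System.out.println(nl.size());  // 2: the write is visible through the wrapper
    }
}
```

An implementation that converted the list into entries at construction time would report 1 in both places, which is the behavior change being warned about.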

I just don't see any way to feasibly achieve your type-safe constructor goal while maintaining back-compat.
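
The constructor clash and the array-parameter trick described earlier in this comment can be sketched as below. ErasureSketch is a hypothetical class written for illustration, not Solr's NamedList, and toEntries is an assumed helper.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical class illustrating the constructor clash; not Solr's NamedList.
final class ErasureSketch<T> {
    private final List<Map.Entry<String, T>> entries;

    // Legacy-style constructor: takes a flat [name, value, ...] raw List and
    // delegates to the typed constructor after casting pair by pair.
    @SuppressWarnings("rawtypes")
    ErasureSketch(List nameValuePairs) {
        this(ErasureSketch.<T>toEntries(nameValuePairs));
    }

    // Adding this overload would not compile: List and
    // List<Map.Entry<String, T>> have the same erasure (List).
    // ErasureSketch(List<Map.Entry<String, T>> typedPairs) { ... }

    // Workaround from the discussion: an array parameter erases to
    // Map.Entry[], which does not clash with List.
    ErasureSketch(Map.Entry<String, T>[] typedPairs) {
        this.entries = new ArrayList<>(Arrays.asList(typedPairs));
    }

    @SuppressWarnings({"rawtypes", "unchecked"})
    private static <V> Map.Entry<String, V>[] toEntries(List flat) {
        Map.Entry<String, V>[] out = new Map.Entry[flat.size() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = new SimpleEntry<>((String) flat.get(2 * i), (V) flat.get(2 * i + 1));
        }
        return out;
    }

    int size() {
        return entries.size();
    }

    T getVal(int idx) {
        return entries.get(idx).getValue();
    }

    public static void main(String[] args) {
        ErasureSketch<Object> nl =
                new ErasureSketch<>(Arrays.<Object>asList("q", "solr", "rows", 10));
        System.out.println(nl.size() + " pairs, rows=" + nl.getVal(1));
    }
}
```

Both constructors coexist here, but the legacy one copies its argument, so later changes to the caller's List are no longer visible; that is the part of backwards compatibility the signature trick does not cover.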

> org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList
> ----------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-912
>                 URL: https://issues.apache.org/jira/browse/SOLR-912
>             Project: Solr
>          Issue Type: Improvement
>          Components: search
>    Affects Versions: 1.4
>         Environment: Tomcat 6, JRE 6, Solr 1.3+ nightlies 
>            Reporter: Kay Kay
>            Priority: Minor
>             Fix For: 1.4
>
>         Attachments: NLProfile.java, SOLR-912.patch, SOLR-912.patch
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> The implementation of NamedList - while being fast - is not necessarily type-safe. I have implemented an additional implementation of the same - ModernNamedList (a type-safe variation providing the same interface as NamedList) - while preserving the semantics in terms of ordering of elements and allowing null elements for key and values (keys are always Strings , while values correspond to generics ). 


[jira] Updated: (SOLR-912) org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList

Posted by "Kay Kay (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kay Kay updated SOLR-912:
-------------------------

    Attachment: NLProfile.java

a sample benchmarking program that works with the previous patch submitted. 

> org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList
> ----------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-912
>                 URL: https://issues.apache.org/jira/browse/SOLR-912
>             Project: Solr
>          Issue Type: Improvement
>          Components: Analysis
>    Affects Versions: 1.4
>         Environment: Tomcat 6, JRE 6, Solr 1.3+ nightlies 
>            Reporter: Kay Kay
>            Priority: Minor
>             Fix For: 1.3.1
>
>         Attachments: NLProfile.java, SOLR-912.patch
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> The implementation of NamedList - while being fast - is not necessarily type-safe. I have implemented an additional implementation of the same - ModernNamedList (a type-safe variation providing the same interface as NamedList) - while preserving the semantics in terms of ordering of elements and allowing null elements for key and values (keys are always Strings , while values correspond to generics ). 



[jira] Issue Comment Edited: (SOLR-912) org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList

Posted by "Noble Paul (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786348#action_12786348 ] 

Noble Paul edited comment on SOLR-912 at 12/5/09 6:34 AM:
----------------------------------------------------------

bq. The performance numbers in here say a different story

I'm not referring to performance numbers here. My point is memory efficiency.

bq. As regarding Entry Objects being a memory hog - do we have some stats to back it up. 

We don't need stats for everything. We should know how the VM holds objects.

Let me illustrate with the case of 5 key->value pairs on a 32-bit machine.

NamedList (backed by an ArrayList)
one Object[]: array size (4) + 5 * 2 * 4 (bytes) = 44 bytes + the overhead of ArrayList

ModernNamedList
one Object[] + 5 entry objects (16 bytes of object overhead each, plus 2 references of 4 + 4 bytes each): array size (4) + 16*5 + 5*2*4 + 5*4 = 144 bytes + the overhead of ArrayList

Add to this the overhead of GC'ing the 5 entry objects.

reference: http://www.cs.virginia.edu/kim/publicity/pldi09tutorials/memory-efficient-java-tutorial.pdf
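
The arithmetic above can be reproduced with a short sketch. The constants are the figures assumed in this comment (4-byte references, a 4-byte array size field, 16 bytes of overhead per entry object) rather than measured values, and the class and method names are hypothetical.

```java
// Sketch only: reproduces the back-of-the-envelope estimate from the comment.
// All constants are the comment's assumptions for a 32-bit VM, not measurements.
class NamedListFootprint {
    static final int REF = 4;           // size of one reference
    static final int ARRAY_SIZE = 4;    // array size field
    static final int ENTRY_HEADER = 16; // per-object overhead of one entry

    // Flat layout: a single Object[] holding name, value, name, value, ...
    static int flatBytes(int pairs) {
        return ARRAY_SIZE + pairs * 2 * REF;
    }

    // Entry layout: an array of references to entry objects,
    // each entry holding a key reference and a value reference.
    static int entryBytes(int pairs) {
        return ARRAY_SIZE            // array size field
             + pairs * ENTRY_HEADER  // object overhead of each entry
             + pairs * 2 * REF       // key + value reference inside each entry
             + pairs * REF;          // array slot pointing at each entry
    }

    public static void main(String[] args) {
        System.out.println("flat:  " + flatBytes(5) + " bytes");  // 44
        System.out.println("entry: " + entryBytes(5) + " bytes"); // 144
    }
}
```

Both figures exclude the overhead of the ArrayList wrapper itself, as in the comment.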

      was (Author: noble.paul):
    bq. The performance numbers in here say a different story

I'm not referring to performance numbers here. My point is memory efficiency.

bq. As regarding Entry Objects being a memory hog - do we have some stats to back it up. 

We don't need stats for everything. We should know how the VM holds objects.

Let me illustrate with the case of 5 key->value pairs on a 32-bit machine.

NamedList (backed by an ArrayList)
one Object[]: array size (4) + 5 * 2 * 4 (bytes) = 44 bytes + the overhead of ArrayList

ModernNamedList
one Object[] + 5 entry objects (each has 2 references of 4 + 4 bytes): array size (4) + 5*2*4 + 5*4 = 64 bytes + the overhead of ArrayList

Add to this the overhead of GC'ing the 5 entry objects.

> org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList
> ----------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-912
>                 URL: https://issues.apache.org/jira/browse/SOLR-912
>             Project: Solr
>          Issue Type: Improvement
>          Components: search
>         Environment: Tomcat 6, JRE 6, Solr 1.3+ nightlies 
>            Reporter: Kay Kay
>            Priority: Minor
>         Attachments: NLProfile.java, SOLR-912.patch, SOLR-912.patch
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> The implementation of NamedList - while being fast - is not necessarily type-safe. I have implemented an additional implementation of the same - ModernNamedList (a type-safe variation providing the same interface as NamedList) - while preserving the semantics in terms of ordering of elements and allowing null elements for key and values (keys are always Strings , while values correspond to generics ). 



[jira] Commented: (SOLR-912) org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12657692#action_12657692 ] 

Yonik Seeley commented on SOLR-912:
-----------------------------------

While we're going down the micro-benchmarking path, I tried eliminating ArrayList and got an additional 15-25% gain on common operations (create new, add between 5 and 15 elements, and then iterate over those elements later).  This was with Java 1.6.  -Xbatch improved the results even more... ~40% - but this is just a micro-benchmark.

{code}
class NamedList2<T> implements INamedList<T> {

  // Entries live in a plain array instead of an ArrayList;
  // size counts the slots that are actually in use.
  protected NamedListEntry<T>[] nvPairs;
  protected int size;

  @SuppressWarnings("unchecked") // generic arrays must be created via the raw type
  public NamedList2() {
    nvPairs = new NamedListEntry[10];
    size = 0;
  }

  @Override
  public int size() {
    return size;
  }

  @Override
  public String getName(int idx) {
    // Slots at or beyond size are unused even if the array is longer.
    if (idx >= size) throw new ArrayIndexOutOfBoundsException();
    return nvPairs[idx].key;
  }

  @Override
  public T getVal(int idx) {
    if (idx >= size) throw new ArrayIndexOutOfBoundsException();
    return nvPairs[idx].value;
  }

  // Doubles the backing array and copies the used slots across.
  @SuppressWarnings("unchecked")
  private void resize() {
    NamedListEntry<T>[] arr = new NamedListEntry[nvPairs.length << 1];
    System.arraycopy(nvPairs, 0, arr, 0, size);
    nvPairs = arr;
  }

  @Override
  public void add(String name, T val) {
    if (size >= nvPairs.length) {
      resize();
    }
    nvPairs[size++] = new NamedListEntry<T>(name, val);
  }

[...]
{code}
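
For readers who want to try the workload described above (create a list, add between 5 and 15 elements, then iterate over them), here is a minimal self-contained sketch. INamedList and NamedListEntry are not shown in this thread, so the sketch uses hypothetical stand-ins (Entry, ArrayBackedList) rather than the classes from the patch, and it does no timing.

```java
import java.util.Arrays;

// Sketch only: Entry and ArrayBackedList are hypothetical stand-ins,
// not the NamedListEntry / NamedList2 classes from the patch.
class ArrayNamedListDemo {

    static final class Entry<T> {
        final String key;
        final T value;
        Entry(String key, T value) { this.key = key; this.value = value; }
    }

    static final class ArrayBackedList<T> {
        private Entry<T>[] pairs;
        private int size;

        @SuppressWarnings("unchecked") // generic array created via the raw type
        ArrayBackedList() {
            pairs = (Entry<T>[]) new Entry[10];
        }

        int size() { return size; }

        String getName(int idx) { check(idx); return pairs[idx].key; }

        T getVal(int idx) { check(idx); return pairs[idx].value; }

        void add(String name, T val) {
            if (size >= pairs.length) {
                // Double the capacity, keeping the existing entries.
                pairs = Arrays.copyOf(pairs, pairs.length << 1);
            }
            pairs[size++] = new Entry<T>(name, val);
        }

        private void check(int idx) {
            if (idx < 0 || idx >= size) throw new ArrayIndexOutOfBoundsException(idx);
        }
    }

    // The workload: build a list of n entries, then iterate and sum the values.
    static int run(int n) {
        ArrayBackedList<Integer> list = new ArrayBackedList<Integer>();
        for (int i = 0; i < n; i++) {
            list.add("k" + i, i);
        }
        int sum = 0;
        for (int i = 0; i < list.size(); i++) {
            if (list.getName(i) != null) sum += list.getVal(i);
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(run(15)); // crosses the initial capacity of 10, so it resizes once
    }
}
```

Wrapping run in a timing loop would approximate the micro-benchmark, with the usual caveats about JIT warm-up.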

> org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList
> ----------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-912
>                 URL: https://issues.apache.org/jira/browse/SOLR-912
>             Project: Solr
>          Issue Type: Improvement
>          Components: Analysis
>    Affects Versions: 1.4
>         Environment: Tomcat 6, JRE 6, Solr 1.3+ nightlies 
>            Reporter: Kay Kay
>            Priority: Minor
>             Fix For: 1.3.1
>
>         Attachments: NLProfile.java, SOLR-912.patch
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> The implementation of NamedList - while being fast - is not necessarily type-safe. I have implemented an additional implementation of the same - ModernNamedList (a type-safe variation providing the same interface as NamedList) - while preserving the semantics in terms of ordering of elements and allowing null elements for key and values (keys are always Strings , while values correspond to generics ). 



[jira] Commented: (SOLR-912) org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList

Posted by "Kay Kay (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12656390#action_12656390 ] 

Kay Kay commented on SOLR-912:
------------------------------

The interface is present only to enable the migration from the legacy NamedList to the new one (with the same properties: Cloneable, Serializable, etc.).

If type safety is not a concern, is there a reason why NamedList<T> is defined as a generic type? We could probably define it as a plain NamedList, with T replaced by Object internally, instead of making it a generic type.

There seem to be no memory leaks as far as the container is concerned. Creating a new iterator object on every call to iterator() seems redundant when, ideally, we could use the iterator of one of the underlying objects instead.
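
The type-safety question can be illustrated with a small sketch. It uses java.util.List as a stand-in for the container (an assumption made for illustration, since neither NamedList variant is reproduced here): values held as Object need a cast that can fail only at runtime, while a generic container lets the compiler reject the mismatch.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: java.util.List stands in for a named-list container.
class TypeSafetyDemo {

    // Values held as Object: the cast compiles but can fail at runtime.
    static boolean untypedReadFails() {
        List<Object> values = new ArrayList<Object>();
        values.add(Integer.valueOf(42));
        try {
            String s = (String) values.get(0); // ClassCastException: Integer is not a String
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    // Values held with a type parameter: a mismatch is caught by the compiler.
    static int typedRead() {
        List<Integer> values = new ArrayList<Integer>();
        values.add(42);
        // values.add("forty-two"); // would not compile
        return values.get(0);
    }

    public static void main(String[] args) {
        System.out.println(untypedReadFails());
        System.out.println(typedRead());
    }
}
```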

> org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList
> ----------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-912
>                 URL: https://issues.apache.org/jira/browse/SOLR-912
>             Project: Solr
>          Issue Type: Improvement
>          Components: clients - java
>    Affects Versions: 1.4
>         Environment: Tomcat 6, JRE 6, Solr 1.3+ nightlies 
>            Reporter: Kay Kay
>            Priority: Minor
>             Fix For: 1.3.1
>
>         Attachments: SOLR-912.patch
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> The implementation of NamedList - while being fast - is not necessarily type-safe. I have implemented an additional implementation of the same - ModernNamedList (a type-safe variation providing the same interface as NamedList) - while preserving the semantics in terms of ordering of elements and allowing null elements for key and values (keys are always Strings , while values correspond to generics ). 



[jira] Resolved: (SOLR-912) org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList

Posted by "Kay Kay (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kay Kay resolved SOLR-912.
--------------------------

    Resolution: Won't Fix

Agreed. With respect to type safety, this patch comes too late in the game to change the underlying behavior at this point.

> org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList
> ----------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-912
>                 URL: https://issues.apache.org/jira/browse/SOLR-912
>             Project: Solr
>          Issue Type: Improvement
>          Components: search
>    Affects Versions: 1.4
>         Environment: Tomcat 6, JRE 6, Solr 1.3+ nightlies 
>            Reporter: Kay Kay
>            Priority: Minor
>             Fix For: 1.4
>
>         Attachments: NLProfile.java, SOLR-912.patch, SOLR-912.patch
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> The implementation of NamedList - while being fast - is not necessarily type-safe. I have implemented an additional implementation of the same - ModernNamedList (a type-safe variation providing the same interface as NamedList) - while preserving the semantics in terms of ordering of elements and allowing null elements for key and values (keys are always Strings , while values correspond to generics ). 
