Posted to j-dev@xerces.apache.org by "Roland Illig (JIRA)" <xe...@xml.apache.org> on 2010/08/14 08:03:16 UTC

[jira] Created: (XERCESJ-1464) SymbolTable.hash uses only 27 bits, when 31 are available

SymbolTable.hash uses only 27 bits, when 31 are available
---------------------------------------------------------

                 Key: XERCESJ-1464
                 URL: https://issues.apache.org/jira/browse/XERCESJ-1464
             Project: Xerces2-J
          Issue Type: Improvement
    Affects Versions: 2.10.0
         Environment: any
            Reporter: Roland Illig
            Priority: Minor


The hash(...) functions mask out the five most significant bits of the hash value. The usual approach is to mask out only the sign bit.

This naturally raises the question of whether omitting one F, giving 0x7FF_FFFF instead of 0x7FFF_FFFF, was intentional or accidental.

Even if the mask is changed to 0x7FFF_FFFF, there shouldn't be a large performance improvement, since the distribution of the entries wouldn't be much better than it is now.

In fact, I'm just curious why you chose 0x7FF_FFFF.
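
To make the difference between the two masks concrete, here is a minimal sketch. The class and method names (HashMaskDemo, mask27, mask31) are hypothetical and are not taken from the Xerces source; the sketch only assumes that the raw hash is a 32-bit int that is ANDed with the mask.

```java
public class HashMaskDemo {
    // Mask from this report: 27 one-bits, so the top five bits are cleared.
    static final int MASK_27 = 0x07FFFFFF;
    // Mask that clears only the sign bit, leaving 31 usable bits.
    static final int MASK_31 = 0x7FFFFFFF;

    // Hypothetical helpers: apply each mask to a raw 32-bit hash value.
    static int mask27(int rawHash) {
        return rawHash & MASK_27;
    }

    static int mask31(int rawHash) {
        return rawHash & MASK_31;
    }

    public static void main(String[] args) {
        // Number of bits each mask keeps.
        System.out.println(Integer.bitCount(MASK_27));
        System.out.println(Integer.bitCount(MASK_31));

        // A raw hash with every bit set: both results are non-negative,
        // but the 27-bit mask throws away four more high bits.
        int raw = 0xFFFFFFFF;
        System.out.println(Integer.toHexString(mask27(raw)));
        System.out.println(Integer.toHexString(mask31(raw)));
    }
}
```

Both masks guarantee a non-negative result, which is what matters for a later modulo operation; the narrower one merely limits the result to values below 2^27.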

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: j-dev-unsubscribe@xerces.apache.org
For additional commands, e-mail: j-dev-help@xerces.apache.org


[jira] Updated: (XERCESJ-1464) SymbolTable.hash uses only 27 bits, when 31 are available

Posted by "Michael Glavassevich (JIRA)" <xe...@xml.apache.org>.
     [ https://issues.apache.org/jira/browse/XERCESJ-1464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Glavassevich updated XERCESJ-1464:
------------------------------------------

    Component/s: Other

> SymbolTable.hash uses only 27 bits, when 31 are available
> ---------------------------------------------------------
>
>                 Key: XERCESJ-1464
>                 URL: https://issues.apache.org/jira/browse/XERCESJ-1464
>             Project: Xerces2-J
>          Issue Type: Improvement
>          Components: Other
>    Affects Versions: 2.10.0
>         Environment: any
>            Reporter: Roland Illig
>            Priority: Minor
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h


[jira] Assigned: (XERCESJ-1464) SymbolTable.hash uses only 27 bits, when 31 are available

Posted by "Michael Glavassevich (JIRA)" <xe...@xml.apache.org>.
     [ https://issues.apache.org/jira/browse/XERCESJ-1464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Glavassevich reassigned XERCESJ-1464:
---------------------------------------------

    Assignee: Michael Glavassevich



[jira] Commented: (XERCESJ-1464) SymbolTable.hash uses only 27 bits, when 31 are available

Posted by "Michael Glavassevich (JIRA)" <xe...@xml.apache.org>.
    [ https://issues.apache.org/jira/browse/XERCESJ-1464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898602#action_12898602 ] 

Michael Glavassevich commented on XERCESJ-1464:
-----------------------------------------------

I'm sure it was just accidental. At the time this was written, SymbolTable had a fixed size (101), so the narrower mask wouldn't have had any impact. We added rehashing later, so switching to 31 bits now could improve distribution in a very large SymbolTable. Thanks for the suggestion.
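
As a rough illustration of why table size matters here, the sketch below assumes the bucket index is computed as the masked hash modulo the table size. The class and method names are hypothetical, and the large table size is chosen only to make the effect visible; it is not a size SymbolTable is known to reach.

```java
public class BucketRangeDemo {
    static final int MASK_27 = 0x07FFFFFF;
    static final int MASK_31 = 0x7FFFFFFF;

    // Assumed bucket computation: masked hash modulo the table size.
    static int bucket(int rawHash, int mask, int tableSize) {
        return (rawHash & mask) % tableSize;
    }

    // The masked hash ranges over [0, mask], so the largest reachable
    // bucket index is the smaller of mask and tableSize - 1.
    static int maxReachableBucket(int mask, int tableSize) {
        return Math.min(mask, tableSize - 1);
    }

    public static void main(String[] args) {
        int small = 101;      // the fixed size mentioned above
        int huge = 1 << 28;   // hypothetical table after many rehashes

        // With 101 buckets, either mask can reach every bucket.
        System.out.println(maxReachableBucket(MASK_27, small));
        System.out.println(maxReachableBucket(MASK_31, small));

        // With 2^28 buckets, the 27-bit mask can never reach the upper half.
        System.out.println(maxReachableBucket(MASK_27, huge));
        System.out.println(maxReachableBucket(MASK_31, huge));
    }
}
```

For any table with at most 2^27 buckets every index stays reachable under either mask, which is consistent with the reporter's expectation that the change brings no large improvement in typical use.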



[jira] Resolved: (XERCESJ-1464) SymbolTable.hash uses only 27 bits, when 31 are available

Posted by "Michael Glavassevich (JIRA)" <xe...@xml.apache.org>.
     [ https://issues.apache.org/jira/browse/XERCESJ-1464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Glavassevich resolved XERCESJ-1464.
-------------------------------------------

    Fix Version/s: 2.11.0
       Resolution: Fixed

Improvement committed in SVN rev 985518.
