Posted to dev@thrift.apache.org by "Bryan Duxbury (JIRA)" <ji...@apache.org> on 2009/02/09 20:48:59 UTC
[jira] Updated: (THRIFT-318) Performance of HashSet for enumeration VALID_VALUES seems poor
[ https://issues.apache.org/jira/browse/THRIFT-318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Bryan Duxbury updated THRIFT-318:
---------------------------------
Attachment: thrift-318.patch
This patch adds a new custom Set implementation, IntRangeSet, that collapses the values into extents of contiguous values. contains(int) then needs at most 2 * (number of extents) comparisons. This proves to be faster than HashSet, likely because it avoids the Integer.valueOf autoboxing and the Integer.hashCode call. In my tests, across a variety of value sets and query values, it's about 60% faster.
I've also amended the Java compiler to use IntRangeSet when generating enums. The struct code itself does not change.
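To make the extent idea concrete, here is a minimal sketch of how such a set could work. It is not the code from thrift-318.patch: the class name IntRangeSetSketch, the flat int[] layout, and the extentCount() helper are assumptions made for illustration, and the sketch does not implement java.util.Set the way the patch's IntRangeSet does.

```java
import java.util.Arrays;

/**
 * Hypothetical sketch of an extent-based int set.
 * Not the IntRangeSet from thrift-318.patch; names and layout are assumed.
 */
final class IntRangeSetSketch {
    // extents[2*i] is the inclusive start of extent i,
    // extents[2*i + 1] is its inclusive end.
    private final int[] extents;

    IntRangeSetSketch(int... values) {
        int[] sorted = values.clone();
        Arrays.sort(sorted);

        int[] tmp = new int[sorted.length * 2];
        int n = 0;
        int i = 0;
        while (i < sorted.length) {
            int start = sorted[i];
            int end = start;
            i++;
            // Absorb duplicates and directly adjacent values into this extent.
            while (i < sorted.length && (sorted[i] == end || sorted[i] == end + 1)) {
                end = sorted[i];
                i++;
            }
            tmp[n++] = start;
            tmp[n++] = end;
        }
        extents = Arrays.copyOf(tmp, n);
    }

    /** Takes a primitive int, so no boxing or hashing is needed. */
    boolean contains(int value) {
        // At most two comparisons per extent.
        for (int i = 0; i < extents.length; i += 2) {
            if (value >= extents[i] && value <= extents[i + 1]) {
                return true;
            }
        }
        return false;
    }

    /** Number of contiguous extents the values collapsed into. */
    int extentCount() {
        return extents.length / 2;
    }
}
```

With the values 1, 2, 3, 7, 8, 10 this collapses to three extents ([1,3], [7,8], [10,10]), so a lookup costs at most six int comparisons. Enum values are usually dense, so in the common case there would be a single extent and two comparisons.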
> Performance of HashSet for enumeration VALID_VALUES seems poor
> --------------------------------------------------------------
>
> Key: THRIFT-318
> URL: https://issues.apache.org/jira/browse/THRIFT-318
> Project: Thrift
> Issue Type: Improvement
> Components: Compiler (Java)
> Reporter: Bryan Duxbury
> Assignee: Bryan Duxbury
> Priority: Minor
> Fix For: 0.1
>
> Attachments: thrift-318.patch
>
>
> It looks like using a HashSet for the VALID_VALUES set we now put in enumerated types was a bad move, performance-wise. There's a fair amount of HashSet/HashMap/Integer overhead generated.
> I think that VALID_VALUES should still be a Set, but we can make a TIntRangeSet or something similar, internal to Thrift, that's more efficient for our use cases and saves some CPU.