Posted to dev@avro.apache.org by "Bruce Mitchener (JIRA)" <ji...@apache.org> on 2010/03/15 06:09:27 UTC

[jira] Created: (AVRO-464) Some hash tables can be arrays instead.

Some hash tables can be arrays instead.
---------------------------------------

                 Key: AVRO-464
                 URL: https://issues.apache.org/jira/browse/AVRO-464
             Project: Avro
          Issue Type: Improvement
          Components: c
            Reporter: Bruce Mitchener
            Assignee: Bruce Mitchener


As far as I can tell, there's no need for the integer->string hash tables in schemas or records ... they can be arrays instead.
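As a rough illustration of the idea (not the actual Avro C code), an integer->string table whose keys are always 0..n-1 in append order can be held in a plain growable array. Every name below (name_array, name_array_append, name_array_get) is hypothetical and chosen only for this sketch; it assumes entries are only ever appended and never removed.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical append-only table of names, indexed by position.
   None of these identifiers come from the Avro C sources. */
struct name_array {
    char **names;    /* names[i] is the string stored for index i */
    size_t count;    /* number of entries in use */
    size_t capacity; /* number of allocated slots */
};

static void name_array_init(struct name_array *a)
{
    a->names = NULL;
    a->count = 0;
    a->capacity = 0;
}

/* Append a copy of name; returns its index, or -1 on allocation failure. */
static long name_array_append(struct name_array *a, const char *name)
{
    assert(a != NULL && name != NULL);
    if (a->count == a->capacity) {
        /* Double the storage so appends are amortized O(1). */
        size_t new_cap = a->capacity ? a->capacity * 2 : 8;
        char **grown = realloc(a->names, new_cap * sizeof *grown);
        if (grown == NULL)
            return -1;
        a->names = grown;
        a->capacity = new_cap;
    }
    size_t len = strlen(name) + 1;
    char *copy = malloc(len);
    if (copy == NULL)
        return -1;
    memcpy(copy, name, len);
    a->names[a->count] = copy;
    return (long)a->count++;
}

/* O(1) lookup by index, no hashing; NULL when index is out of range. */
static const char *name_array_get(const struct name_array *a, size_t index)
{
    return index < a->count ? a->names[index] : NULL;
}

/* Release every copied string and reset the array to empty. */
static void name_array_free(struct name_array *a)
{
    for (size_t i = 0; i < a->count; i++)
        free(a->names[i]);
    free(a->names);
    name_array_init(a);
}
```

With this layout, iterating from 0 to count touches contiguous memory and a lookup is a bounds check plus one array read, which is where the saving over a hash table would be expected to come from.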


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (AVRO-464) Some hash tables can be arrays instead.

Posted by "Bruce Mitchener (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/AVRO-464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12845193#action_12845193 ] 

Bruce Mitchener commented on AVRO-464:
--------------------------------------

But right now the code iterates from 0 to the number of entries, so sparse entries still would not work. My thought is to progressively rework this from where it is now to an open-addressed hash, which can be more efficient and still support sparse entries in the future. I think we can get rid of a fair bit of overhead in the current implementation.

I will have an experimental patch soon...
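
For readers unfamiliar with the term, here is a minimal sketch of an open-addressed hash keyed by integers, using linear probing. It is not the experimental patch and not the Avro C implementation; the names (int_map, int_map_put, int_map_get), the hash constant, and the half-full growth rule are all assumptions made for this example. Values are borrowed pointers, and a NULL value marks an empty slot.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical open-addressed map from integer keys to borrowed strings. */
struct int_map_slot {
    int64_t key;
    const char *value; /* NULL marks an empty slot */
};

struct int_map {
    struct int_map_slot *slots;
    size_t capacity; /* always a power of two */
    size_t count;    /* number of occupied slots */
};

/* Multiplicative mixing so that nearby keys spread across the table. */
static size_t int_map_hash(int64_t key)
{
    uint64_t h = (uint64_t)key * 0x9E3779B97F4A7C15ULL;
    return (size_t)(h >> 32);
}

/* capacity must be a power of two; returns 0 on success, -1 on failure. */
static int int_map_init(struct int_map *m, size_t capacity)
{
    assert(capacity != 0 && (capacity & (capacity - 1)) == 0);
    m->capacity = capacity;
    m->count = 0;
    m->slots = malloc(capacity * sizeof *m->slots);
    if (m->slots == NULL)
        return -1;
    for (size_t i = 0; i < capacity; i++)
        m->slots[i].value = NULL;
    return 0;
}

/* Linear probing: walk from the hashed position until the key or an
   empty slot is found. Terminates because the table is never full. */
static struct int_map_slot *int_map_probe(const struct int_map *m, int64_t key)
{
    size_t mask = m->capacity - 1;
    size_t i = int_map_hash(key) & mask;
    while (m->slots[i].value != NULL && m->slots[i].key != key)
        i = (i + 1) & mask;
    return &m->slots[i];
}

/* Insert or overwrite; returns 0 on success, -1 on allocation failure. */
static int int_map_put(struct int_map *m, int64_t key, const char *value)
{
    assert(value != NULL);
    if ((m->count + 1) * 2 > m->capacity) {
        /* Keep the table at most half full by doubling and reinserting. */
        struct int_map bigger;
        if (int_map_init(&bigger, m->capacity * 2) != 0)
            return -1;
        for (size_t i = 0; i < m->capacity; i++) {
            if (m->slots[i].value != NULL) {
                *int_map_probe(&bigger, m->slots[i].key) = m->slots[i];
                bigger.count++;
            }
        }
        free(m->slots);
        *m = bigger;
    }
    struct int_map_slot *slot = int_map_probe(m, key);
    if (slot->value == NULL)
        m->count++;
    slot->key = key;
    slot->value = value;
    return 0;
}

/* Returns the stored string, or NULL when the key is absent. */
static const char *int_map_get(const struct int_map *m, int64_t key)
{
    return int_map_probe(m, key)->value;
}

static void int_map_free(struct int_map *m)
{
    free(m->slots);
    m->slots = NULL;
    m->capacity = 0;
    m->count = 0;
}
```

Because all slots live in one allocation and there are no per-entry nodes, this kind of table avoids the pointer chasing of a chained hash while still allowing keys such as 0, 7 and 1000000 to coexist without allocating the gaps between them.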




[jira] Commented: (AVRO-464) Some hash tables can be arrays instead.

Posted by "Matt Massie (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/AVRO-464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12845191#action_12845191 ] 

Matt Massie commented on AVRO-464:
----------------------------------

That's correct, given that the current API only allows you to append elements.

My thought was that we might want to support sparse arrays in the future and the current integer hash tables easily support that.




[jira] Updated: (AVRO-464) Rework internals of records and schemas for greater performance

Posted by "Bruce Mitchener (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/AVRO-464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bruce Mitchener updated AVRO-464:
---------------------------------

    Summary: Rework internals of records and schemas for greater performance  (was: Some hash tables can be arrays instead.)

I am broadening the scope of this. I have some larger changes underway and it will be easier as a single patch.

The initial changes cut the time to serialize the same record object 10,000,000 times, without validation, from 4.2 to 3.7 seconds. I think we can do better, though.

