Posted to dev@avro.apache.org by "Douglas Creager (JIRA)" <ji...@apache.org> on 2011/01/28 18:24:43 UTC

[jira] Created: (AVRO-751) Store schema reference in datum instances

Store schema reference in datum instances
-----------------------------------------

                 Key: AVRO-751
                 URL: https://issues.apache.org/jira/browse/AVRO-751
             Project: Avro
          Issue Type: Improvement
          Components: c
            Reporter: Douglas Creager
            Assignee: Douglas Creager
         Attachments: 0001-Store-schema-reference-in-datum-instances.patch

This is a patch that lets us keep track of which schema an avro_datum_t is an instance of.  This is a breaking API change, but I think it makes the API simpler and more logical.  From the commit message:

    We now keep track of which particular schema an avro_datum_t is an
    instance of.  For primitive values, there's only one possible schema,
    and so we don't store an explicit reference.  For compound values, the
    datum constructors now take in a schema parameter, which is stored in
    the avro_datum_t instance.  For records, enums, and fixeds, this means
    that we don't need to store the name of the schema type anymore, since
    we can get this from the schema.
    
    Several functions that operate on datum instances also used to take
    a schema parameter (avro_datum_to_json, for example).  Those
    parameters aren't needed anymore, since the datum already carries a
    reference to its own schema.
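
As a rough illustration of the shape described in the commit message, here is a toy model in C. It is not the libavro code and does not use the libavro API: every type and function name below (toy_enum_schema_t, toy_datum_t, toy_enum, toy_datum_to_json, and so on) is hypothetical, and the reference counting is an assumption made for the sketch. It only shows a primitive constructor with no schema argument, a compound constructor that stores the schema, and a serializer that no longer needs a schema parameter.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Toy model only: these types and functions are hypothetical and are
 * not the libavro definitions. */
typedef enum { TOY_INT32, TOY_ENUM } toy_type_t;

typedef struct {
    const char *name;            /* schema name, e.g. "Suit" */
    const char *const *symbols;  /* enum symbols */
    int n_symbols;
    int refcount;                /* assumed counting scheme for the sketch */
} toy_enum_schema_t;

typedef struct {
    toy_type_t type;
    toy_enum_schema_t *schema;   /* NULL for primitives: the tag is enough */
    int value;                   /* int32 value, or enum symbol index */
} toy_datum_t;

/* Primitive constructor: no schema argument, since only one schema fits. */
toy_datum_t *toy_int32(int value)
{
    toy_datum_t *d = malloc(sizeof *d);
    if (d == NULL) return NULL;
    d->type = TOY_INT32;
    d->schema = NULL;
    d->value = value;
    return d;
}

/* Compound constructor: takes the schema and keeps a counted reference. */
toy_datum_t *toy_enum(toy_enum_schema_t *schema, int index)
{
    assert(schema != NULL && index >= 0 && index < schema->n_symbols);
    toy_datum_t *d = malloc(sizeof *d);
    if (d == NULL) return NULL;
    d->type = TOY_ENUM;
    d->schema = schema;
    schema->refcount++;
    d->value = index;
    return d;
}

/* The serializer takes no schema parameter: it asks the datum instead. */
int toy_datum_to_json(const toy_datum_t *d, char *buf, size_t size)
{
    if (d->type == TOY_ENUM)
        return snprintf(buf, size, "\"%s\"", d->schema->symbols[d->value]);
    return snprintf(buf, size, "%d", d->value);
}

/* The type name is read from the schema rather than stored in the datum. */
const char *toy_datum_type_name(const toy_datum_t *d)
{
    return d->schema != NULL ? d->schema->name : "int";
}

/* Releasing a datum drops its reference to the schema. */
void toy_datum_free(toy_datum_t *d)
{
    if (d == NULL) return;
    if (d->schema != NULL) d->schema->refcount--;
    free(d);
}
```

With this shape, a caller builds the schema once, passes it to toy_enum(), and later calls toy_datum_to_json(datum, buf, sizeof buf) without handing the schema over a second time.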

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (AVRO-751) Store schema reference in datum instances

Posted by "Douglas Creager (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/AVRO-751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Douglas Creager updated AVRO-751:
---------------------------------

    Attachment: 0001-Store-schema-reference-in-datum-instances.patch



[jira] Updated: (AVRO-751) Store schema reference in datum instances

Posted by "Douglas Creager (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/AVRO-751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Douglas Creager updated AVRO-751:
---------------------------------

    Status: Patch Available  (was: Open)


[jira] Commented: (AVRO-751) Store schema reference in datum instances

Posted by "Douglas Creager (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/AVRO-751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12989671#comment-12989671 ] 

Douglas Creager commented on AVRO-751:
--------------------------------------

On the C side, we were keeping a strdup-ed copy of the type name for records, enums, and fixeds (and being inconsistent about whether we were also storing the namespace).  This patch also reduces the memory footprint: each datum stores only a reference to its schema and gets the names from that.
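
To make the footprint argument concrete, here is a small sketch comparing the two layouts for a fixed datum. The struct and function names are hypothetical and the fields are simplified; they are not the actual libavro structs. The only point is that a per-datum copy of the name costs an extra allocation proportional to the name length, while a schema pointer costs one word no matter how long the name is.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical layouts for illustration; not the real libavro structs. */
typedef struct {
    const char *name;   /* stored once, shared by every datum of this type */
    size_t size;        /* length of the fixed value in bytes */
} toy_fixed_schema_t;

/* Before: each datum owns a private, heap-allocated copy of the name. */
typedef struct {
    char *name;
    unsigned char *bytes;
    size_t size;
} old_fixed_datum_t;

/* After: each datum reaches the name and size through its schema. */
typedef struct {
    const toy_fixed_schema_t *schema;
    unsigned char *bytes;
} new_fixed_datum_t;

/* Bookkeeping bytes per datum, not counting the payload itself. */
size_t old_fixed_overhead(const char *name)
{
    return sizeof(old_fixed_datum_t) + strlen(name) + 1;
}

size_t new_fixed_overhead(void)
{
    return sizeof(new_fixed_datum_t);
}

/* The name is still available, just one pointer hop away. */
const char *new_fixed_name(const new_fixed_datum_t *d)
{
    assert(d != NULL && d->schema != NULL);
    return d->schema->name;
}
```

In this model the saving grows with both the number of datum instances and the length of the type name, since the name is no longer duplicated per instance.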


[jira] Commented: (AVRO-751) Store schema reference in datum instances

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/AVRO-751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12989294#comment-12989294 ] 

Doug Cutting commented on AVRO-751:
-----------------------------------

This is how the Java implementation does it.  Keeping a reference to the schema is consistent and convenient, and it provides full type information at runtime.  The only cost is a slight increase in the size of the map and array headers.


[jira] Commented: (AVRO-751) Store schema reference in datum instances

Posted by "Bruce Mitchener (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/AVRO-751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12988747#comment-12988747 ] 

Bruce Mitchener commented on AVRO-751:
--------------------------------------

Can you explain a bit more what the concrete gain is here?



[jira] Updated: (AVRO-751) Store schema reference in datum instances

Posted by "Douglas Creager (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/AVRO-751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Douglas Creager updated AVRO-751:
---------------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

I've gone ahead and committed this to SVN, to make sure that the API change gets in for 1.5.0.  We can open up a new issue to address the conflict with your Win32 patch for 1.5.1.


[jira] Commented: (AVRO-751) Store schema reference in datum instances

Posted by "Bruce Mitchener (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/AVRO-751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12989675#comment-12989675 ] 

Bruce Mitchener commented on AVRO-751:
--------------------------------------

That sounds like a good win.

This conflicts with work that I'm hoping to finish up by tomorrow for a Win32 port ... so we'll have to resolve that in the next couple of days.


[jira] Updated: (AVRO-751) Store schema reference in datum instances

Posted by "Douglas Creager (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/AVRO-751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Douglas Creager updated AVRO-751:
---------------------------------

         Priority: Blocker  (was: Major)
    Fix Version/s: 1.5.0


[jira] Commented: (AVRO-751) Store schema reference in datum instances

Posted by "Douglas Creager (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/AVRO-751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12989883#comment-12989883 ] 

Douglas Creager commented on AVRO-751:
--------------------------------------

Cool.  I'm at the Strata conference this week, so my availability might be spotty, but let me know how I can help.  Is there a JIRA entry for the Win32 port yet?
