You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@avro.apache.org by "Sharad Agarwal (JIRA)" <ji...@apache.org> on 2009/07/20 05:46:14 UTC

[jira] Created: (AVRO-80) 'reflect' implementation requires Avro classes to define types

'reflect' implementation requires Avro classes to define types
--------------------------------------------------------------

                 Key: AVRO-80
                 URL: https://issues.apache.org/jira/browse/AVRO-80
             Project: Avro
          Issue Type: Bug
          Components: java
            Reporter: Sharad Agarwal


Since 'reflect' implementation is expected to work with existing code, it should work with java types. For example it should work with java String instead of org.apache.avro.util.Utf8.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Work started: (AVRO-80) 'reflect' implementation requires Avro classes to define types

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/AVRO-80?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on AVRO-80 started by Doug Cutting.

> 'reflect' implementation requires Avro classes to define types
> --------------------------------------------------------------
>
>                 Key: AVRO-80
>                 URL: https://issues.apache.org/jira/browse/AVRO-80
>             Project: Avro
>          Issue Type: Bug
>          Components: java
>            Reporter: Sharad Agarwal
>            Assignee: Doug Cutting
>         Attachments: AVRO-80.patch
>
>
> Since 'reflect' implementation is expected to work with existing code, it should work with java types. For example it should work with java String instead of org.apache.avro.util.Utf8.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (AVRO-80) 'reflect' implementation requires Avro classes to define types

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/AVRO-80?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12780803#action_12780803 ] 

Doug Cutting commented on AVRO-80:
----------------------------------

> Should isString() report true on utf8?

I don't think so.  Reflect should be independent of Avro-specific datatypes.

> Should isBytes understand ByteBuffer?

I think most folks use byte[] for binary data.  If we find otherwise, we can add reflection support for ByteBuffer too.  As you've seen with arrays though, it's tricky to map a single Avro type to two different Java types, so I'd rather avoid doing so unless we need to.

> explain what's going on with the LIST_CLASSES thing

Good idea.  I've added a comment for this.

> Have you considered subclassing Schema to keep the list information, instead of the LIST_CLASSES static?

That's not possible without exposing a lot of internals.  A better long-term approach might be to have schemas support extensible properties.

>  Does it ever come up that you made an array, but then need to lengthen it later, because of the chunked encoding?

Yes.  This becomes a bug in the reflect implementation after this patch is committed: it cannot correctly read chunked data without throwing ArrayIndexOutOfBounds.  But its unlikely the reflect implementation will ever generate blocked data, nor be used to read blocked data.

> Why not just Arrays.asList(o).iterator() here?

This does not work with int[], but only with Integer[].  We need something that works with both.  Plus that allocates two objects instead of one.

> Should we import WeakIdentityHashMap's test too?

None exist.






> 'reflect' implementation requires Avro classes to define types
> --------------------------------------------------------------
>
>                 Key: AVRO-80
>                 URL: https://issues.apache.org/jira/browse/AVRO-80
>             Project: Avro
>          Issue Type: Bug
>          Components: java
>            Reporter: Sharad Agarwal
>            Assignee: Doug Cutting
>         Attachments: AVRO-80.patch, AVRO-80.patch
>
>
> Since 'reflect' implementation is expected to work with existing code, it should work with java types. For example it should work with java String instead of org.apache.avro.util.Utf8.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (AVRO-80) 'reflect' implementation requires Avro classes to define types

Posted by "Philip Zeyliger (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/AVRO-80?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12780461#action_12780461 ] 

Philip Zeyliger commented on AVRO-80:
-------------------------------------

Took a look.  Overall, confusing, but I can see why it's necessary.  Some questions and comments...

* Should isString() report true on utf8?

* Should isBytes understand ByteBuffer?

* Should ReflectData.validate() on java arrays check the type of the array?  Is that do-able?

* ReflectData.getClass() needs to explain what's going on with the
LIST_CLASSES thing.  Leave a note somewhere to explain that
it's a map determining whether to use List<?> or java native
arrays for a given field.

* Have you considered subclassing Schema to keep the list information, instead of the LIST_CLASSES static?

* Why does peekarray return null?

* Does it ever come up that you made an array, but then need to lengthen it later, because of the chunked encoding?

* Why not just Arrays.asList(o).iterator() here?
bq. protected Iterator<Object> getArrayElements
 
* Should we import WeakIdentityHashMap's test too?

* For TestReflect, it would be great to see some tests that
check that exceptions are thrown on schemas that TestReflect doesn't support.

* The current tests don't hit the line:
bq.     Type component = ((GenericArrayType)type).getGenericComponentType();
      


> 'reflect' implementation requires Avro classes to define types
> --------------------------------------------------------------
>
>                 Key: AVRO-80
>                 URL: https://issues.apache.org/jira/browse/AVRO-80
>             Project: Avro
>          Issue Type: Bug
>          Components: java
>            Reporter: Sharad Agarwal
>            Assignee: Doug Cutting
>         Attachments: AVRO-80.patch, AVRO-80.patch
>
>
> Since 'reflect' implementation is expected to work with existing code, it should work with java types. For example it should work with java String instead of org.apache.avro.util.Utf8.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (AVRO-80) 'reflect' implementation requires Avro classes to define types

Posted by "Sharad Agarwal (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/AVRO-80?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12762937#action_12762937 ] 

Sharad Agarwal commented on AVRO-80:
------------------------------------

I think there should be a default and easy way to work with Strings. I came across couple of instances where potential users enquired about this and this limitation deterred them to use Avro.
Should we make the reflect *default* representation of String schema as java.lang.String itself? I agree that performance is the concern but I think for reflect that trade-off is fine.

> 'reflect' implementation requires Avro classes to define types
> --------------------------------------------------------------
>
>                 Key: AVRO-80
>                 URL: https://issues.apache.org/jira/browse/AVRO-80
>             Project: Avro
>          Issue Type: Bug
>          Components: java
>            Reporter: Sharad Agarwal
>
> Since 'reflect' implementation is expected to work with existing code, it should work with java types. For example it should work with java String instead of org.apache.avro.util.Utf8.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Resolved: (AVRO-80) 'reflect' implementation requires Avro classes to define types

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/AVRO-80?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Cutting resolved AVRO-80.
------------------------------

       Resolution: Fixed
    Fix Version/s: 1.3.0
     Hadoop Flags: [Incompatible change]

I just committed this.

> 'reflect' implementation requires Avro classes to define types
> --------------------------------------------------------------
>
>                 Key: AVRO-80
>                 URL: https://issues.apache.org/jira/browse/AVRO-80
>             Project: Avro
>          Issue Type: Bug
>          Components: java
>            Reporter: Sharad Agarwal
>            Assignee: Doug Cutting
>             Fix For: 1.3.0
>
>         Attachments: AVRO-80.patch, AVRO-80.patch
>
>
> Since 'reflect' implementation is expected to work with existing code, it should work with java types. For example it should work with java String instead of org.apache.avro.util.Utf8.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (AVRO-80) 'reflect' implementation requires Avro classes to define types

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/AVRO-80?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Cutting updated AVRO-80:
-----------------------------

    Attachment: AVRO-80.patch

Here's a first version of this.  All tests pass.  I added new tests in TestReflect, but should probably add more before we commit this.

> 'reflect' implementation requires Avro classes to define types
> --------------------------------------------------------------
>
>                 Key: AVRO-80
>                 URL: https://issues.apache.org/jira/browse/AVRO-80
>             Project: Avro
>          Issue Type: Bug
>          Components: java
>            Reporter: Sharad Agarwal
>            Assignee: Doug Cutting
>         Attachments: AVRO-80.patch
>
>
> Since 'reflect' implementation is expected to work with existing code, it should work with java types. For example it should work with java String instead of org.apache.avro.util.Utf8.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Assigned: (AVRO-80) 'reflect' implementation requires Avro classes to define types

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/AVRO-80?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Cutting reassigned AVRO-80:
--------------------------------

    Assignee: Doug Cutting

> 'reflect' implementation requires Avro classes to define types
> --------------------------------------------------------------
>
>                 Key: AVRO-80
>                 URL: https://issues.apache.org/jira/browse/AVRO-80
>             Project: Avro
>          Issue Type: Bug
>          Components: java
>            Reporter: Sharad Agarwal
>            Assignee: Doug Cutting
>
> Since 'reflect' implementation is expected to work with existing code, it should work with java types. For example it should work with java String instead of org.apache.avro.util.Utf8.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (AVRO-80) 'reflect' implementation requires Avro classes to define types

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/AVRO-80?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12759793#action_12759793 ] 

Doug Cutting commented on AVRO-80:
----------------------------------

Do folks feel we should switch the default from String to Utf8?  This has a performance implications, since Strings are immutable, while a Utf8 can and is re-used.  Also, when strings are created they must decode UTF-8, while Utf8 only decodes on demand.

Alternately, we could include implementations of ReflectData, DatumReader and DatumWriter that represent strings with java.lang.String so that folks can use whichever they prefer.  Is that too confusing?

> 'reflect' implementation requires Avro classes to define types
> --------------------------------------------------------------
>
>                 Key: AVRO-80
>                 URL: https://issues.apache.org/jira/browse/AVRO-80
>             Project: Avro
>          Issue Type: Bug
>          Components: java
>            Reporter: Sharad Agarwal
>
> Since 'reflect' implementation is expected to work with existing code, it should work with java types. For example it should work with java String instead of org.apache.avro.util.Utf8.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (AVRO-80) 'reflect' implementation requires Avro classes to define types

Posted by "Scott Carey (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/AVRO-80?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12774442#action_12774442 ] 

Scott Carey commented on AVRO-80:
---------------------------------

In the future, what Java types Avro reflection maps to can be configurable.

Jackson's annotations, and mix-in annotations might be good for reference when thinking about the sort of flexibility needed.

http://wiki.fasterxml.com/JacksonAnnotations
http://wiki.fasterxml.com/JacksonMixInAnnotations

>From one end: what to do with the reflect case -- to the other: a specific mapping from an object to a schema -- there will need to be a lot of flexibility in the long run and Jackson has a lot of nice examples to keep in mind.

> 'reflect' implementation requires Avro classes to define types
> --------------------------------------------------------------
>
>                 Key: AVRO-80
>                 URL: https://issues.apache.org/jira/browse/AVRO-80
>             Project: Avro
>          Issue Type: Bug
>          Components: java
>            Reporter: Sharad Agarwal
>            Assignee: Doug Cutting
>
> Since 'reflect' implementation is expected to work with existing code, it should work with java types. For example it should work with java String instead of org.apache.avro.util.Utf8.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (AVRO-80) 'reflect' implementation requires Avro classes to define types

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/AVRO-80?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Cutting updated AVRO-80:
-----------------------------

    Attachment: AVRO-80.patch

Here's a version with more tests and few bugs fixed.

> 'reflect' implementation requires Avro classes to define types
> --------------------------------------------------------------
>
>                 Key: AVRO-80
>                 URL: https://issues.apache.org/jira/browse/AVRO-80
>             Project: Avro
>          Issue Type: Bug
>          Components: java
>            Reporter: Sharad Agarwal
>            Assignee: Doug Cutting
>         Attachments: AVRO-80.patch, AVRO-80.patch
>
>
> Since 'reflect' implementation is expected to work with existing code, it should work with java types. For example it should work with java String instead of org.apache.avro.util.Utf8.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (AVRO-80) 'reflect' implementation requires Avro classes to define types

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/AVRO-80?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12774432#action_12774432 ] 

Doug Cutting commented on AVRO-80:
----------------------------------

The more I think about it, the more I agree that reflect should be independent of Avro's runtime classes, instead using built-in Java classes.  In particular, reflection should represent Avro strings with java.lang.String and Avro arrays with java.util.ArrayList.  I'll start work on this ASAP.

> 'reflect' implementation requires Avro classes to define types
> --------------------------------------------------------------
>
>                 Key: AVRO-80
>                 URL: https://issues.apache.org/jira/browse/AVRO-80
>             Project: Avro
>          Issue Type: Bug
>          Components: java
>            Reporter: Sharad Agarwal
>            Assignee: Doug Cutting
>
> Since 'reflect' implementation is expected to work with existing code, it should work with java types. For example it should work with java String instead of org.apache.avro.util.Utf8.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (AVRO-80) 'reflect' implementation requires Avro classes to define types

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/AVRO-80?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12777595#action_12777595 ] 

Doug Cutting commented on AVRO-80:
----------------------------------

The mixin annotations seem like they could be a useful pattern for Avro.  Thanks for pointing them out!

For this issue I just intend to switch the default mapping for strings and arrays to use common, built-in Java types.  I might even try to be clever and map both Foo[] and ArrayList<Foo> to Avro arrays, or I might punt that to a follow-on issue.  Making it configurable with mixin annotations is definitely a follow-on.


> 'reflect' implementation requires Avro classes to define types
> --------------------------------------------------------------
>
>                 Key: AVRO-80
>                 URL: https://issues.apache.org/jira/browse/AVRO-80
>             Project: Avro
>          Issue Type: Bug
>          Components: java
>            Reporter: Sharad Agarwal
>            Assignee: Doug Cutting
>
> Since 'reflect' implementation is expected to work with existing code, it should work with java types. For example it should work with java String instead of org.apache.avro.util.Utf8.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.