Posted to user@avro.apache.org by Stu Hood <st...@rackspace.com> on 2010/04/05 22:17:55 UTC

Using 'bytes' as keys in a 'map' equivalent

Hey gang,

I can understand the reasoning behind AVRO-9, but now I need to look for an alternative to a 'map' that will allow me to store an association of bytes keys to values.

Is there a recommended pattern to store this structure, or is "list of (key,value) tuples" the best option?

Thanks much,

Stu Hood
@stuhood
Architecture Software Developer
Email & Apps Division, Rackspace Hosting


Re: Using 'bytes' as keys in a 'map' equivalent

Posted by Jeff Hodges <jh...@twitter.com>.
Hacky, but acceptable.
--
Jeff

On Apr 5, 2010 5:51 PM, "Doug Cutting" <cu...@apache.org> wrote:

Stu Hood wrote:
>
> I can understand the reasoning behind AVRO-9, but now I need to look for
an alte...
A map of Foo has the same binary format as an array of records, each with a
string field and a Foo field.  So an application can use an array schema
similar to this to represent map-like structures with, e.g., non-string
keys.

Perhaps we could establish standard properties that indicate that a given
array of records should be represented in a map-like way if possible?
 E.g.,:

{"type": "array", "isMap": true, "items": {"type":"record", ...}}

Doug

Re: Using 'bytes' as keys in a 'map' equivalent

Posted by Doug Cutting <cu...@apache.org>.
Stu Hood wrote:
> I can understand the reasoning behind AVRO-9, but now I need to look for an alternative to a 'map' that will allow me to store an association of bytes keys to values.

A map of Foo has the same binary format as an array of records, each 
with a string field and a Foo field.  So an application can use an array 
schema similar to this to represent map-like structures with, e.g., 
non-string keys.

Perhaps we could establish standard properties that indicate that a 
given array of records should be represented in a map-like way if 
possible?  E.g.,:

{"type": "array", "isMap": true, "items": {"type":"record", ...}}

Doug
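
A rough way to see the equivalence described above is to hand-encode both shapes and compare the bytes. The sketch below is not taken from the thread or from any Avro library: the helper names (encode_long, encode_blocks, and so on) and the "Pair" record with "key"/"value" fields are made up for illustration, and it only covers the simple single-block form of the binary encoding.

```python
# Hypothetical schema for a map-like structure with bytes keys.
# The record and field names are assumptions, not from the thread.
PAIR_ARRAY_SCHEMA = {
    "type": "array",
    "items": {
        "type": "record",
        "name": "Pair",
        "fields": [
            {"name": "key", "type": "bytes"},
            {"name": "value", "type": "long"},
        ],
    },
}


def encode_long(n: int) -> bytes:
    """Avro long: zigzag, then 7 bits per byte, low-order group first."""
    z = (n << 1) ^ (n >> 63)  # zigzag; assumes n fits in 64 bits
    out = bytearray()
    while z & ~0x7F:
        out.append((z & 0x7F) | 0x80)  # set the continuation bit
        z >>= 7
    out.append(z)
    return bytes(out)


def encode_bytes(b: bytes) -> bytes:
    """Avro bytes: length as a long, then the raw bytes."""
    return encode_long(len(b)) + b


def encode_string(s: str) -> bytes:
    """Avro string: length-prefixed UTF-8."""
    return encode_bytes(s.encode("utf-8"))


def encode_blocks(items: list) -> bytes:
    """Arrays and maps share this framing: one counted block, then a 0 count."""
    body = encode_long(len(items)) + b"".join(items) if items else b""
    return body + encode_long(0)


def encode_map(d: dict, enc_value) -> bytes:
    """A map item is a string key followed by its value."""
    return encode_blocks([encode_string(k) + enc_value(v) for k, v in d.items()])


def encode_pair_array(pairs: list, enc_key, enc_value) -> bytes:
    """A record is its fields concatenated in order: key, then value."""
    return encode_blocks([enc_key(k) + enc_value(v) for k, v in pairs])


# Same bytes for a map and for an array of (string, long) records.
as_map = encode_map({"a": 1, "b": 2}, encode_long)
as_array = encode_pair_array([("a", 1), ("b", 2)], encode_string, encode_long)
assert as_map == as_array

# Switching the key encoder to bytes is the only change needed for bytes keys.
bytes_keyed = encode_pair_array([(b"\x00\xff", 1)], encode_bytes, encode_long)
print(bytes_keyed.hex())  # 020400ff0200
```

Since strings and bytes are both length-prefixed, the array-of-records form costs nothing extra on the wire; the only difference is what the schema promises about the key.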

Re: Using 'bytes' as keys in a 'map' equivalent

Posted by Jeff Hodges <jh...@twitter.com>.
What languages are problematic? Would they also map the bytes type to
the string type like ruby and python?
--
Jeff

On Mon, Apr 5, 2010 at 1:48 PM, Scott Carey <sc...@richrelevance.com> wrote:
> For some other use cases, allowing all intrinsic simple types to be keys would be useful in the future too. This would let the map column type in Hive's schema system fit directly into Avro, for example.
>
> But as AVRO-9 points out, this would be problematic in many scripting languages that only support string dictionaries.
>
> Complex types as keys are problematic and, IMO, should be avoided.
>
> Avro 2.0 perhaps?
>
>
> In the meantime I'm serializing such data as an array of (k, v) tuples, and the client object API deals with providing a map interface. After all, that is all that a map is in Avro anyway: syntactic sugar around an array with some type checking.
>
> -Scott
>
> On Apr 5, 2010, at 1:24 PM, Jeff Hodges wrote:
>
>> This would probably be of much benefit to the Cassandra community as
>> they/we are working on getting their stored data to be keyed by byte
>> arrays instead of String arrays[1]. Converting back and forth would
>> be less good.
>>
>> [1] Relevant ticket: https://issues.apache.org/jira/browse/CASSANDRA-767
>> --
>> Jeff
>>
>> On Mon, Apr 5, 2010 at 1:17 PM, Stu Hood <st...@rackspace.com> wrote:
>>> Hey gang,
>>>
>>> I can understand the reasoning behind AVRO-9, but now I need to look for an alternative to a 'map' that will allow me to store an association of bytes keys to values.
>>>
>>> Is there a recommended pattern to store this structure, or is "list of (key,value) tuples" the best option?
>>>
>>> Thanks much,
>>>
>>> Stu Hood
>>> @stuhood
>>> Architecture Software Developer
>>> Email & Apps Division, Rackspace Hosting
>>>
>>>
>
>

Re: Using 'bytes' as keys in a 'map' equivalent

Posted by Scott Carey <sc...@richrelevance.com>.
For some other use cases, allowing all intrinsic simple types to be keys would be useful in the future too. This would let the map column type in Hive's schema system fit directly into Avro, for example.

But as AVRO-9 points out, this would be problematic in many scripting languages that only support string dictionaries.

Complex types as keys are problematic and, IMO, should be avoided.

Avro 2.0 perhaps?


In the meantime I'm serializing such data as an array of (k, v) tuples, and the client object API deals with providing a map interface. After all, that is all that a map is in Avro anyway: syntactic sugar around an array with some type checking.

-Scott
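
The "client object API provides a map interface" approach might look roughly like the sketch below. This is an assumption about what such a wrapper could be, not Scott's actual code: the class name PairArrayMap and the "key"/"value" field names are hypothetical, and the records are assumed to be already-decoded dicts.

```python
from collections.abc import Mapping


class PairArrayMap(Mapping):
    """Read-only map view over a decoded array of {"key", "value"} records."""

    def __init__(self, records):
        # bytes() normalizes bytearray keys so they are hashable.
        # If a key repeats, the later record wins (an arbitrary choice here).
        self._index = {bytes(r["key"]): r["value"] for r in records}

    def __getitem__(self, key):
        return self._index[bytes(key)]

    def __iter__(self):
        return iter(self._index)

    def __len__(self):
        return len(self._index)

    def to_records(self):
        """Convert back to the array-of-records shape for serialization."""
        return [{"key": k, "value": v} for k, v in self._index.items()]


decoded = [
    {"key": b"\x00\x01", "value": "first"},
    {"key": bytearray(b"\xff"), "value": "second"},
]
view = PairArrayMap(decoded)
assert view[b"\xff"] == "second"
assert b"\x00\x01" in view
assert len(view) == 2
```

Subclassing Mapping supplies get(), keys(), items() and membership tests from the three methods defined above, so callers can treat the decoded array like an ordinary dictionary.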

On Apr 5, 2010, at 1:24 PM, Jeff Hodges wrote:

> This would probably be of much benefit to the Cassandra community as
> they/we are working on getting their stored data to be keyed by byte
> arrays instead of String arrays[1]. Converting back and forth would
> be less good.
> 
> [1] Relevant ticket: https://issues.apache.org/jira/browse/CASSANDRA-767
> --
> Jeff
> 
> On Mon, Apr 5, 2010 at 1:17 PM, Stu Hood <st...@rackspace.com> wrote:
>> Hey gang,
>> 
>> I can understand the reasoning behind AVRO-9, but now I need to look for an alternative to a 'map' that will allow me to store an association of bytes keys to values.
>> 
>> Is there a recommended pattern to store this structure, or is "list of (key,value) tuples" the best option?
>> 
>> Thanks much,
>> 
>> Stu Hood
>> @stuhood
>> Architecture Software Developer
>> Email & Apps Division, Rackspace Hosting
>> 
>> 


Re: Using 'bytes' as keys in a 'map' equivalent

Posted by Jeff Hodges <jh...@twitter.com>.
This would probably be of much benefit to the Cassandra community as
they/we are working on getting their stored data to be keyed by byte
arrays instead of String arrays[1]. Converting back and forth would
be less good.

[1] Relevant ticket: https://issues.apache.org/jira/browse/CASSANDRA-767
--
Jeff

On Mon, Apr 5, 2010 at 1:17 PM, Stu Hood <st...@rackspace.com> wrote:
> Hey gang,
>
> I can understand the reasoning behind AVRO-9, but now I need to look for an alternative to a 'map' that will allow me to store an association of bytes keys to values.
>
> Is there a recommended pattern to store this structure, or is "list of (key,value) tuples" the best option?
>
> Thanks much,
>
> Stu Hood
> @stuhood
> Architecture Software Developer
> Email & Apps Division, Rackspace Hosting
>
>