Posted to user@avro.apache.org by yael aharon <ya...@gmail.com> on 2014/03/04 17:53:28 UTC

Saving arbitrary data to avro files

Hello,
I am writing a C++ library that stores arbitrary data to Avro files.
The schema is given to me through my library's API.
All the data is given to me in the form of strings, including integers,
doubles, etc.
Is there a way for me to store this data to Avro files without converting
the strings to the correct types first? I am concerned about the
performance impact that this conversion would have.
Thanks, Yael
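
For reference, the conversion step in question can be quite small. The following is a minimal sketch using only the C++17 standard library, not the Avro C++ API; the helper names parseLong and parseDouble are hypothetical and chosen for illustration. It assumes each incoming string holds exactly one value and treats anything else as an error. It does not check for out-of-range doubles.

```cpp
#include <charconv>
#include <cstdint>
#include <cstdlib>
#include <optional>
#include <string>
#include <system_error>

// Hypothetical helper: parse a decimal integer held in a string,
// e.g. for a field whose schema type is "long".
// Returns std::nullopt unless the whole string is a valid integer.
inline std::optional<std::int64_t> parseLong(const std::string& text) {
    std::int64_t value = 0;
    const char* first = text.data();
    const char* last = first + text.size();
    auto [ptr, ec] = std::from_chars(first, last, value);
    if (ec != std::errc() || ptr != last) return std::nullopt;
    return value;
}

// Hypothetical helper: parse a floating-point number held in a string,
// e.g. for a field whose schema type is "double".
// Returns std::nullopt unless the whole string was consumed.
inline std::optional<double> parseDouble(const std::string& text) {
    if (text.empty()) return std::nullopt;
    char* end = nullptr;
    const double value = std::strtod(text.c_str(), &end);
    if (end != text.c_str() + text.size()) return std::nullopt;
    return value;
}
```

A library could pick one of these helpers per field based on the type named in the schema it was handed, and fall back to storing the raw string only when parsing fails.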

Re: Saving arbitrary data to avro files

Posted by yael aharon <ya...@gmail.com>.
Thank you for your reply.
If I use a string-only schema, how would the reader learn the correct
schema?


On Tue, Mar 4, 2014 at 1:08 PM, Harsh J <ha...@cloudera.com> wrote:

> You can do this, sure. You just need a schema of string type or something
> similar.
>
> Are you not concerned about the read time of the data you plan to store as
> strings? Typically you write once and read more than once during processing.
>
> Storing the data types in proper serialized form would help greatly during
> reads.

Re: Saving arbitrary data to avro files

Posted by Harsh J <ha...@cloudera.com>.
You can do this, sure. You just need a schema of string type or something
similar.

Are you not concerned about the read time of the data you plan to store as
strings? Typically you write once and read more than once during processing.

Storing the values in their proper serialized types would help greatly
during reads.
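
To illustrate the point about serialized form, here is a rough sketch of how Avro's binary encoding treats a long versus the same number kept as a string. It follows the encoding rules in the Avro specification (a long is a zig-zag variable-length integer; a string is a long length prefix followed by the UTF-8 bytes), but it is hand-written for illustration and is not the Avro C++ library's encoder. The function names encodeLong and encodeString are hypothetical.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: encode a long as Avro binary does,
// zig-zag first, then 7 bits per byte with a continuation bit.
inline std::vector<std::uint8_t> encodeLong(std::int64_t n) {
    // Zig-zag maps small negative and positive values to small unsigned ones.
    std::uint64_t u = (static_cast<std::uint64_t>(n) << 1) ^
                      (n < 0 ? ~std::uint64_t{0} : std::uint64_t{0});
    std::vector<std::uint8_t> out;
    while (u >= 0x80) {
        out.push_back(static_cast<std::uint8_t>((u & 0x7F) | 0x80));
        u >>= 7;
    }
    out.push_back(static_cast<std::uint8_t>(u));
    return out;
}

// Hypothetical helper: encode a string as Avro binary does,
// a long byte count followed by the raw bytes.
inline std::vector<std::uint8_t> encodeString(const std::string& text) {
    std::vector<std::uint8_t> out =
        encodeLong(static_cast<std::int64_t>(text.size()));
    out.insert(out.end(), text.begin(), text.end());
    return out;
}
```

Under these rules the value 1234567 takes 4 bytes as a long but 8 bytes as the string "1234567", and a reader of the string form still has to parse the text on every read, whereas the typed form is parsed once at write time.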