Posted to user@avro.apache.org by sammefford <sa...@mefford.org> on 2013/03/21 19:26:18 UTC

Where are the rows in Trevni format?

I read the Trevni Specification:
http://avro.apache.org/docs/1.7.4/trevni/spec.html
and I can't see where the row ids are stored for each value in each column. 
Am I missing something obvious?  Is the spec incomplete on that point?

Also, to confirm, my understanding is columnar formats are efficient because
they store column values sorted and can thereby find specific values or
ranges of values quickly.  While the spec mentions the benefits of sorting,
I don't see a requirement that column values be sorted.  Can we depend on
the blocks of column values being sorted?

Thanks,

Sam Mefford
Chief Architect-Big Data Solutions
Avalon Consulting, LLC.
801-706-9731




Re: Where are the rows in Trevni format?

Posted by Doug Cutting <cu...@apache.org>.
Row numbers are not stored explicitly.  They are implicit in the
ordinal position of values in the file.

Values are not sorted but are in row order.  The primary performance
advantage of a columnar file is that, when only a subset of columns
are required, only a subset of the data need be read.

Doug
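
The two points in the reply above (row numbers implied by position, and
reading only the columns that are needed) can be illustrated with a small
sketch. This is a toy in-memory model in Python, not the Trevni file layout
or any Trevni API; the sample data and the names to_columns and read_row are
hypothetical and exist only for illustration.

```python
# Toy illustration only: this is NOT the Trevni on-disk format or API.
# It models a column store as one list per field, kept in row order.

rows = [
    {"id": 1, "name": "a", "score": 0.5},
    {"id": 2, "name": "b", "score": 0.9},
    {"id": 3, "name": "c", "score": 0.1},
]


def to_columns(rows):
    """Store each field's values in its own list, preserving row order."""
    columns = {}
    for row in rows:
        for field, value in row.items():
            columns.setdefault(field, []).append(value)
    return columns


def read_row(columns, row_number, wanted):
    """Rebuild part of a row. No row id is stored anywhere: the row
    number is simply the position of the value within each column."""
    return {field: columns[field][row_number] for field in wanted}


columns = to_columns(rows)

# Values appear in row order, not sorted order.
print(columns["score"])  # [0.5, 0.9, 0.1]

# Only the "id" and "score" columns are touched; "name" is never read.
print(read_row(columns, 1, ["id", "score"]))  # {'id': 2, 'score': 0.9}
```

In this model, a query that needs two of three columns reads two lists and
skips the third entirely, which is the saving the reply describes. Finding a
specific value still requires scanning the column, since nothing is sorted.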
