Posted to dev@iceberg.apache.org by Chen Song <ch...@gmail.com> on 2021/03/24 18:01:28 UTC

Question on ordering on partitions when read

I want to clarify the ordering semantics (and whether they are deterministic)
for partitions returned when reading with the Iceberg core data API.

Say I define a table with a *time* column, partition it by *day(time)*, and
do the following writes.

partition (day)    time                   other data fields
2020-10-01         2020-10-01 01:01:01    ...
2020-10-01         2020-10-01 02:01:01    ...
2020-10-02         2020-10-02 01:01:01    ...
2020-10-02         2020-10-02 02:01:01    ...

Then I read everything using something like the following.

    IcebergGenerics.read(table).build();

I did see rows returned in the right order in terms of partitions. But if
I append the same data again and read again, I see rows returned like this.

2020-10-01         2020-10-01 01:01:01    ...
2020-10-01         2020-10-01 02:01:01    ...
2020-10-02         2020-10-02 01:01:01    ...
2020-10-02         2020-10-02 02:01:01    ...
2020-10-01         2020-10-01 01:01:01    ...
2020-10-01         2020-10-01 02:01:01    ...
2020-10-02         2020-10-02 01:01:01    ...
2020-10-02         2020-10-02 02:01:01    ...

In other words, the rows are returned ordered first by commit time and then
by partition *day*. If I want to ensure that data from partition 2020-10-01 is
always returned before 2020-10-02 in the above example, is there a way to
configure the reader to do that? I checked the reader API and cannot seem
to find a method to do that.

Please note that I am NOT talking about sorting within a partition,
which I know has to be enforced by the writer.

-- 
Chen Song

Re: Question on ordering on partitions when read

Posted by Ryan Blue <rb...@netflix.com.INVALID>.
Yeah, I'd use IcebergGenerics to read a table. That's the simplest way.

On Thu, Mar 25, 2021 at 11:49 AM Chen Song <ch...@gmail.com> wrote:

> Thanks Ryan. Reading one partition at a time sounds like a logical approach
> in my case.
>
> I cannot use a query engine for now. In that case, is IcebergGenerics
> still the best way to read via the core API?

-- 
Ryan Blue
Software Engineer
Netflix

Re: Question on ordering on partitions when read

Posted by Chen Song <ch...@gmail.com>.
Thanks Ryan. Reading one partition at a time sounds like a logical approach
in my case.

I cannot use a query engine for now. In that case, is IcebergGenerics still
the best way to read via the core API?

On Thu, Mar 25, 2021 at 2:16 PM Ryan Blue <rb...@netflix.com.invalid> wrote:

> Hi Chen,
>
> Iceberg doesn't guarantee any order for records returned by
> `IcebergGenerics`. If you want a specific order, I'd recommend either using
> a query engine to sort, or reading one partition at a time and then sorting
> within that partition.
>
> Iceberg can't really guarantee order across files. The sort order that files
> are written with may change over time, and Iceberg will also use the lack
> of a guarantee to work faster in some cases. For example, most job planning
> is done by reading manifest files in parallel, so data files aren't returned
> in any particular order. Iceberg will also pack files into tasks in most
> cases (though not for `IcebergGenerics`), so files can be reordered
> depending on size as well.


-- 
Chen Song

Re: Question on ordering on partitions when read

Posted by Ryan Blue <rb...@netflix.com.INVALID>.
Hi Chen,

Iceberg doesn't guarantee any order for records returned by
`IcebergGenerics`. If you want a specific order, I'd recommend either using
a query engine to sort, or reading one partition at a time and then sorting
within that partition.
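
A minimal sketch of the partition-at-a-time approach, using plain Java
collections as a stand-in for the table rather than the Iceberg API. Rows are
modeled only by their *time* value, and `sampleTable`, `readDay`, and
`readInPartitionOrder` are hypothetical names. With Iceberg, `readDay` would
presumably be a scan restricted to one day, for example through
`IcebergGenerics.read(table).where(...)`.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.TreeSet;

class PartitionOrderedRead {
    // The four rows from the example, appended twice (two commits).
    static List<LocalDateTime> sampleTable() {
        List<LocalDateTime> commit = Arrays.asList(
            LocalDateTime.parse("2020-10-01T01:01:01"),
            LocalDateTime.parse("2020-10-01T02:01:01"),
            LocalDateTime.parse("2020-10-02T01:01:01"),
            LocalDateTime.parse("2020-10-02T02:01:01"));
        List<LocalDateTime> table = new ArrayList<>(commit);
        table.addAll(commit);
        return table;
    }

    // Hypothetical stand-in for a scan filtered to a single day partition.
    // Rows come back in whatever order the "table" happens to hold them.
    static List<LocalDateTime> readDay(List<LocalDateTime> table, LocalDate day) {
        List<LocalDateTime> rows = new ArrayList<>();
        for (LocalDateTime time : table) {
            if (time.toLocalDate().equals(day)) {
                rows.add(time);
            }
        }
        return rows;
    }

    // Visit the days in ascending order, read each one, and sort inside it.
    static List<LocalDateTime> readInPartitionOrder(List<LocalDateTime> table) {
        TreeSet<LocalDate> days = new TreeSet<>();
        for (LocalDateTime time : table) {
            days.add(time.toLocalDate());
        }
        List<LocalDateTime> out = new ArrayList<>();
        for (LocalDate day : days) {
            List<LocalDateTime> rows = readDay(table, day);
            Collections.sort(rows); // the reader gives no order, so sort here
            out.addAll(rows);
        }
        return out;
    }

    public static void main(String[] args) {
        for (LocalDateTime time : readInPartitionOrder(sampleTable())) {
            System.out.println(time.toLocalDate() + "  " + time);
        }
    }
}
```

The ordering comes entirely from the caller: the list of days is sorted before
reading, and each day's rows are sorted after reading, so nothing depends on
the order in which the reader returns records.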

Iceberg can't really guarantee order across files. The sort order that files
are written with may change over time, and Iceberg will also use the lack of
a guarantee to work faster in some cases. For example, most job planning is
done by reading manifest files in parallel, so data files aren't returned in
any particular order. Iceberg will also pack files into tasks in most
cases (though not for `IcebergGenerics`), so files can be reordered
depending on size as well.
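
To illustrate why parallel planning leaves no defined file order, here is a
small sketch in plain Java. It is not Iceberg's planner: the manifests are
modeled as lists of made-up file names, and `sampleManifests` and `planFiles`
are hypothetical helpers. Each manifest is read on its own thread and the
results are collected as they arrive.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class ParallelPlanningSketch {
    // Each inner list models the data files listed by one manifest
    // (one manifest per commit in this example; names are made up).
    static List<List<String>> sampleManifests() {
        return Arrays.asList(
            Arrays.asList("day=2020-10-01/a.parquet", "day=2020-10-02/a.parquet"),
            Arrays.asList("day=2020-10-01/b.parquet", "day=2020-10-02/b.parquet"));
    }

    // Read every manifest on its own thread and collect files as they arrive.
    // The resulting order depends on thread scheduling, not on partition values.
    static List<String> planFiles(List<List<String>> manifests) {
        ConcurrentLinkedQueue<String> planned = new ConcurrentLinkedQueue<>();
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, manifests.size()));
        for (List<String> manifest : manifests) {
            pool.execute(() -> planned.addAll(manifest));
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("planning timed out");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("planning interrupted", e);
        }
        return new ArrayList<>(planned);
    }

    public static void main(String[] args) {
        List<String> planned = planFiles(sampleManifests());
        System.out.println("as planned: " + planned);

        // A caller that needs partition order has to impose it afterwards.
        List<String> sorted = new ArrayList<>(planned);
        Collections.sort(sorted);
        System.out.println("sorted:     " + sorted);
    }
}
```

The set of planned files is always the same, but the sequence can differ
between runs, which is why a caller that needs partition order has to sort
or filter explicitly.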

On Thu, Mar 25, 2021 at 8:06 AM Chen Song <ch...@gmail.com> wrote:

> Bumping this question.

-- 
Ryan Blue
Software Engineer
Netflix

Re: Question on ordering on partitions when read

Posted by Chen Song <ch...@gmail.com>.
Bumping this question.

-- 
Chen Song