Posted to dev@impala.apache.org by superwemanhella <go...@gmail.com> on 2023/04/02 13:49:40 UTC

I would like to work on IMPALA-11993

Hello

I want to work on IMPALA-11993 and need the privilege to self-assign JIRA
stories. My JIRA username is Gia3.

*Approach -*
The documentation at
https://impala.apache.org/docs/build/asf-site-html/topics/impala_s3.html
says:

   - The TRUNCATE TABLE
     <https://impala.apache.org/docs/build/asf-site-html/topics/impala_truncate_table.html#truncate_table>
     statement always removes the corresponding data files from S3 when the
     table is truncated.


But we want to update the 'Writing Iceberg tables' section of
https://impala.apache.org/docs/build/asf-site-html/topics/impala_iceberg.html
with these details, as mentioned in the story:

*However, for Iceberg it's not the case as Iceberg 'only' creates a new
snapshot for the table but doesn't delete the files, because for time
travel they could be relevant for older snapshots.*

Please let me know if this understanding is correct. Also, why can't we
have a search bar in this documentation, so finding keywords is easy?

https://impala.apache.org/docs/build/asf-site-html/topics/impala_iceberg.html

*Thank You*
*Gowthami B*

Re: I would like to work on IMPALA-11993

Posted by Zoltán Borók-Nagy <bo...@cloudera.com>.
Hey,

Thanks for your interest in the project.
Yeah, your understanding is correct: for Iceberg tables (and actually also
for Hive ACID tables) Impala just creates a new empty snapshot.
So the operation doesn't affect concurrent readers, and time-travel
queries can still reach older snapshots, as Csaba mentioned.
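
To illustrate the point, here is a toy Python sketch. It is not Impala or
Iceberg code: the class, method, and file names are made up for illustration,
and real Iceberg snapshots track manifests and metadata that this ignores. It
only shows that truncate commits a new empty snapshot while older snapshots,
and the files they reference, stay reachable:

```python
from dataclasses import dataclass, field


@dataclass
class ToyIcebergTable:
    """Toy model only: each snapshot is a frozen set of data file names."""
    snapshots: list = field(default_factory=lambda: [frozenset()])
    files_on_storage: set = field(default_factory=set)

    def insert(self, file_name):
        # A write adds a data file and commits a snapshot that references it.
        self.files_on_storage.add(file_name)
        self.snapshots.append(self.snapshots[-1] | {file_name})

    def truncate(self):
        # Truncate commits a new empty snapshot; no file is removed from storage.
        self.snapshots.append(frozenset())

    def read(self, snapshot_index=-1):
        # Reading an older snapshot index stands in for a time-travel query.
        return sorted(self.snapshots[snapshot_index])


table = ToyIcebergTable()
table.insert("a.parquet")
table.insert("b.parquet")
table.truncate()

current_files = table.read()                      # latest snapshot is empty
before_truncate = table.read(snapshot_index=-2)   # snapshot before truncate
```

After the truncate, `current_files` is empty, `before_truncate` still lists
both files, and `files_on_storage` is unchanged. Cleaning up old files is a
separate step tied to expiring old snapshots, which this sketch does not model.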

Feel free to add me as reviewer on your gerrit CR.

Cheers,
    Zoltan


On Mon, Apr 3, 2023 at 9:02 AM Csaba Ringhofer <cs...@cloudera.com>
wrote:

> Hi!
>
> Added you as a contributor to Impala (welcome!) and assigned IMPALA-11993.
> Did a quick check and Impala doesn't delete files during truncate in
> Iceberg tables. This allows the content before truncation to be reached
> through time travel.
> I agree that we should specifically mention that truncate and insert
> overwrite do not delete the old files. I am not that familiar with Iceberg,
> though, so I would prefer that someone with more experience comment on this
> too.
>
> > Also, why can't we have a search bar in this documentation, so finding
> > keywords is easy?
> That's a good idea!
>
> Csaba

Re: I would like to work on IMPALA-11993

Posted by Csaba Ringhofer <cs...@cloudera.com>.
Hi!

Added you as a contributor to Impala (welcome!) and assigned IMPALA-11993.
Did a quick check and Impala doesn't delete files during truncate in
Iceberg tables. This allows the content before truncation to be reached
through time travel.
I agree that we should specifically mention that truncate and insert
overwrite do not delete the old files. I am not that familiar with Iceberg,
though, so I would prefer that someone with more experience comment on this
too.

> Also, why can't we have a search bar in this documentation, so finding
> keywords is easy?
That's a good idea!

Csaba

