Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/11/14 22:57:49 UTC

[GitHub] [iceberg] jessiedanwang opened a new issue, #6188: Is it possible to partition on effective and expiration date for Iceberg data

jessiedanwang opened a new issue, #6188:
URL: https://github.com/apache/iceberg/issues/6188

   ### Query engine
   
   Spark
   
   ### Question
   
   Is it possible to use both the effective date and the expiration date as partition columns for SCD type 2 dimension data? The dimension dataset is huge, and we would like to partition it on both dates so that queries can prune irrelevant data. Here is an example:
   
   create table mytable (id bigint autoincrement, name text, city text, effective date);
   
   insert into mytable values ('Jen', 'Austin', '2017-01-01');
   insert into mytable values ('Mike', 'Austin', '2017-07-01');
   Upsert mytable values ('Jen', 'Tokyo', '2018-01-01');
   
   what's in mytable
   id name city effective
   1 Jen Tokyo 2018-01-01
   2 Mike Austin 2017-07-01
   
   Traditional SCD type 2 state based on the same event stream:
   create table mytable_scd2 (id bigint autoincrement, dimid bigint, name text, city text)
   partitioned by (effective date, expiration date);
   
   what's in mytable_scd2
   id dimid name city effective expiration
   1 1 Jen Austin '2017-01-01' '2018-01-01' <--- this row would change partition when its expiration goes from null to a value
   2 2 Mike Austin '2017-07-01' null
   3 1 Jen Tokyo '2018-01-01' null
   
   Given the above example, my question is: will the row (1 1 Jen Austin '2017-01-01' '2018-01-01') move to a different partition when its expiration date is updated from null to '2018-01-01'?
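
To make the scenario concrete, here is a toy Python model of the row in question. It is not Iceberg or Spark code: it only assumes that both date columns are partitioned by their raw values (identity partitioning), so that a row's partition is determined by the pair (effective, expiration). `Scd2Row` and `partition_key` are hypothetical names introduced for illustration, and the non-key columns (name, city) are omitted for brevity.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import Optional, Tuple


@dataclass(frozen=True)
class Scd2Row:
    """Hypothetical stand-in for one row of mytable_scd2 (key columns only)."""
    id: int
    dimid: int
    effective: date
    expiration: Optional[date]  # None while the row is still the current version


def partition_key(row: Scd2Row) -> Tuple[date, Optional[date]]:
    # Assumption: identity partitioning, so the partition is simply the
    # pair of column values. A null expiration is its own partition value.
    return (row.effective, row.expiration)


# Row 1 while it is still current (expiration is null).
current = Scd2Row(id=1, dimid=1, effective=date(2017, 1, 1), expiration=None)

# The same row after the SCD2 update closes it out.
expired = replace(current, expiration=date(2018, 1, 1))

before = partition_key(current)
after = partition_key(expired)
moved = before != after

print("partition before:", before)
print("partition after: ", after)
print("row changes partition:", moved)
```

Under that assumption the two keys differ, so the updated row no longer belongs to the partition it was first written to. In a real table this would presumably show up as the update removing the row from the null-expiration partition and writing it into the ('2017-01-01', '2018-01-01') partition.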
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] github-actions[bot] closed issue #6188: Is it possible to partition on effective and expiration date for Iceberg data

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] closed issue #6188: Is it possible to partition on effective and expiration date for Iceberg data
URL: https://github.com/apache/iceberg/issues/6188




[GitHub] [iceberg] github-actions[bot] commented on issue #6188: Is it possible to partition on effective and expiration date for Iceberg data

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on issue #6188:
URL: https://github.com/apache/iceberg/issues/6188#issuecomment-1546773766

   This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in the next 14 days if no further activity occurs. To permanently prevent this issue from being considered stale, add the label 'not-stale', but commenting on the issue is preferred when possible.




[GitHub] [iceberg] github-actions[bot] commented on issue #6188: Is it possible to partition on effective and expiration date for Iceberg data

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on issue #6188:
URL: https://github.com/apache/iceberg/issues/6188#issuecomment-1565753125

   This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'.

