Posted to dev@falcon.apache.org by Satish Mittal <sa...@apache.org> on 2014/04/02 12:21:52 UTC
Re: Review Request 18626: FALCON-284: Hcatalog based feed retention doesn't
work when partition filter spans across multiple partition keys
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18626/
-----------------------------------------------------------
(Updated April 2, 2014, 10:21 a.m.)
Review request for Falcon and Srikanth Sundarrajan.
Changes
-------
Attaching updated patch with review comments incorporated.
Repository: falcon-git
Description
-------
When an HCatalog-based feed is scheduled in Falcon, retention looks only at the first partition key that matches one of the date patterns yyyy | MM | dd | HH | mm. As a result, it computes a partition filter that contains only one of these patterns. However, if the HCatalog table is defined so that the date spans multiple partition keys (year/month/day/hour/minute), feed retention doesn't delete any partitions that are more granular than the first level (year).
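To illustrate the idea, here is a minimal sketch of a filter that covers every date-bearing partition key rather than only the first one. It is not the code from the patch: the class PartitionFilterSketch, the method buildFilter, and the filter syntax are assumptions made for illustration, and the sketch assumes partition values are zero-padded strings so that string comparison matches date order.

```java
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Date;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TimeZone;

// Hypothetical helper for illustration only; not Falcon's FeedEvictor code.
final class PartitionFilterSketch {

    /**
     * Builds a filter selecting partitions strictly older than the cutoff.
     *
     * @param keyToPattern partition key -> date pattern, ordered coarse to fine
     *                     (for example year -> yyyy, month -> MM, day -> dd)
     * @param cutoff       instant before which partitions are eligible for eviction
     */
    static String buildFilter(Map<String, String> keyToPattern, Date cutoff) {
        List<String> clauses = new ArrayList<>();
        // Accumulates "key = 'value' and " for all coarser keys seen so far.
        StringBuilder equalsPrefix = new StringBuilder();
        for (Map.Entry<String, String> entry : keyToPattern.entrySet()) {
            SimpleDateFormat format = new SimpleDateFormat(entry.getValue());
            format.setTimeZone(TimeZone.getTimeZone("UTC")); // assumed time zone
            String value = format.format(cutoff);
            // Coarser keys equal the cutoff, this key is strictly smaller.
            clauses.add("(" + equalsPrefix + entry.getKey() + " < '" + value + "')");
            equalsPrefix.append(entry.getKey())
                    .append(" = '").append(value).append("' and ");
        }
        return String.join(" or ", clauses);
    }

    public static void main(String[] args) {
        Map<String, String> keys = new LinkedHashMap<>();
        keys.put("year", "yyyy");
        keys.put("month", "MM");
        keys.put("day", "dd");
        System.out.println(buildFilter(keys, new Date()));
    }
}
```

For a table partitioned by year/month/day and a cutoff of 2014-04-02, the sketch yields (year < '2014') or (year = '2014' and month < '04') or (year = '2014' and month = '04' and day < '02'), so partitions at every level of granularity are matched rather than only those that differ in year.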
Diffs (updated)
-----
common/src/main/java/org/apache/falcon/catalog/AbstractCatalogService.java fc9c3b1
common/src/main/java/org/apache/falcon/catalog/HiveCatalogService.java 3c3660e
common/src/main/java/org/apache/falcon/entity/common/FeedDataPath.java 4031e14
retention/src/main/java/org/apache/falcon/retention/FeedEvictor.java a8db52e
webapp/src/test/java/org/apache/falcon/catalog/HiveCatalogServiceIT.java fd004a1
webapp/src/test/java/org/apache/falcon/lifecycle/TableStorageFeedEvictorIT.java 770780e
Diff: https://reviews.apache.org/r/18626/diff/
Testing
-------
- Added new integration tests in TableStorageFeedEvictorIT.java to test retention for an HCatalog feed where the date spans multiple partition columns (year/month/day).
- Verified the retention behavior on a test cluster with an HCatalog-based feed partitioned by year/month/day/hour/minute/country.
Thanks,
Satish Mittal
Re: Review Request 18626: FALCON-284: Hcatalog based feed retention doesn't
work when partition filter spans across multiple partition keys
Posted by Satish Mittal <sa...@apache.org>.
> On April 23, 2014, 10:03 p.m., Seetharam Venkatesh wrote:
> > Did you run test-patch profile? If so, I'll verify and commit.
Yes, I had tested with test-patch profile.
- Satish
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18626/#review41226
-----------------------------------------------------------
Re: Review Request 18626: FALCON-284: Hcatalog based feed retention doesn't
work when partition filter spans across multiple partition keys
Posted by Seetharam Venkatesh <ve...@innerzeal.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18626/#review41226
-----------------------------------------------------------
Ship it!
Did you run test-patch profile? If so, I'll verify and commit.
- Seetharam Venkatesh