Posted to dev@hbase.apache.org by "Andrew Purtell (JIRA)" <ji...@apache.org> on 2009/09/11 20:26:57 UTC
[jira] Assigned: (HBASE-47) option to set TTL for columns in hbase
[ https://issues.apache.org/jira/browse/HBASE-47?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andrew Purtell reassigned HBASE-47:
-----------------------------------
Assignee: Andrew Purtell
> option to set TTL for columns in hbase
> --------------------------------------
>
> Key: HBASE-47
> URL: https://issues.apache.org/jira/browse/HBASE-47
> Project: Hadoop HBase
> Issue Type: New Feature
> Components: hql, regionserver
> Reporter: Billy Pearson
> Assignee: Andrew Purtell
> Priority: Minor
> Fix For: 0.2.0
>
> Attachments: hbase-ttl-0.2-r652401.patch, hbase-ttl-0.2-r652725.patch, hbase-ttl-0.2-r652919.patch
>
>
> I would like to see an option to set a TTL on columns in HBase. This feature could be helpful in removing stale data from large datasets without having to do a full scan of the dataset and then issue deletes.
> Examples:
> Say I am crawling pages and only refreshing pages based on a set score. If some pages do not get updated for X days, the old version of the page gets removed from the dataset.
> Say I am stripping links out of HTML and storing them. If a link is removed from a page, I would need to issue a delete statement to remove that link from the dataset. With a TTL, the link data would remove itself if not updated within X seconds. These are just examples based on crawling, like Nutch, but I can foresee many apps using this option.
> This is a feature in Bigtable that is handled when Bigtable does garbage collection.
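The idea described in the request, dropping cells that have not been rewritten within a TTL as part of a garbage-collection pass, can be sketched roughly as follows. This is only an illustration under stated assumptions: `Cell`, `is_expired`, and `compact` are hypothetical names, not HBase's API and not the code in the attached patches.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Cell:
    """Hypothetical stored cell; not an HBase class."""
    row: str
    column: str
    timestamp: float  # time of the last write, in seconds
    value: bytes


def is_expired(cell: Cell, ttl_seconds: Optional[float], now: float) -> bool:
    """Return True if the cell is older than the TTL. None means no TTL is set."""
    if ttl_seconds is None:
        return False
    return now - cell.timestamp > ttl_seconds


def compact(cells: List[Cell], ttl_seconds: Optional[float], now: float) -> List[Cell]:
    """Assumed garbage-collection pass: keep only cells that have not expired.

    No client-side scan or explicit delete is needed; stale cells simply
    are not carried over when the store is rewritten.
    """
    return [c for c in cells if not is_expired(c, ttl_seconds, now)]
```

In the crawling example, a link cell that is rewritten on every crawl keeps a fresh timestamp and survives, while a link that has disappeared from the page is never rewritten and ages out once the TTL elapses.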
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.