Posted to dev@phoenix.apache.org by "Gabriel Reid (JIRA)" <ji...@apache.org> on 2014/07/24 10:47:39 UTC
[jira] [Created] (PHOENIX-1108) Clarify, verify, and document
intended behavior from using HColumnDescriptor.KEEP_DELETED_CELLS
Gabriel Reid created PHOENIX-1108:
-------------------------------------
Summary: Clarify, verify, and document intended behavior from using HColumnDescriptor.KEEP_DELETED_CELLS
Key: PHOENIX-1108
URL: https://issues.apache.org/jira/browse/PHOENIX-1108
Project: Phoenix
Issue Type: Improvement
Reporter: Gabriel Reid
Assignee: Gabriel Reid
The current default for all Phoenix tables is to enable the KEEP_DELETED_CELLS flag on all column families. This default should be reviewed, and it should be verified that it works as intended (particularly with the ChunkedResultIterator, which uses multiple scans).
The general idea of the KEEP_DELETED_CELLS flag is that it prevents deleted cells from being permanently removed during a (major) compaction. If the number of versions to keep for a cell is small (3 is the default) then this won’t cause a major problem, and it might be needed for queries to function correctly (i.e. to handle deletes and a major compaction occurring while a query is being run).
On the other hand, if the number of versions to keep for a column family is large (e.g. Integer.MAX_VALUE), the default of KEEP_DELETED_CELLS=true will mean that a delete in Phoenix never actually deletes data.
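To make the interaction between the two settings concrete, here is a minimal sketch in plain Java. It is a toy model, not HBase or Phoenix code: the class KeepDeletedCellsSketch and its methods are hypothetical names, and the model assumes only that a deleted cell survives a major compaction when the flag is set and that the version limit still applies either way. It ignores TTL, MIN_VERSIONS, and the delete markers themselves.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: this is not HBase or Phoenix code.
final class KeepDeletedCellsSketch {

    // One stored version of a cell; 'deleted' means a delete marker covers it.
    static final class Version {
        final long timestamp;
        final boolean deleted;

        Version(long timestamp, boolean deleted) {
            this.timestamp = timestamp;
            this.deleted = deleted;
        }
    }

    // Builds a hypothetical history of 'puts' versions, newest first,
    // all covered by a later delete.
    static List<Version> deletedHistory(int puts) {
        List<Version> history = new ArrayList<>();
        for (long ts = puts; ts >= 1; ts--) {
            history.add(new Version(ts, true));
        }
        return history;
    }

    // Returns the versions that remain on disk after a simulated major compaction.
    static List<Version> majorCompact(List<Version> newestFirst,
                                      int maxVersions,
                                      boolean keepDeletedCells) {
        List<Version> survivors = new ArrayList<>();
        for (Version v : newestFirst) {
            if (v.deleted && !keepDeletedCells) {
                continue; // without the flag, a covering delete is enough to drop the version
            }
            if (survivors.size() >= maxVersions) {
                break; // the version limit applies regardless of the flag
            }
            survivors.add(v);
        }
        return survivors;
    }

    public static void main(String[] args) {
        List<Version> history = deletedHistory(5);
        System.out.println(majorCompact(history, 3, false).size());
        System.out.println(majorCompact(history, 3, true).size());
        System.out.println(majorCompact(history, Integer.MAX_VALUE, true).size());
    }
}
```

In this model, five deleted versions compact down to 0 without the flag, to 3 with the flag and a version limit of 3, and stay at 5 with the flag and a limit of Integer.MAX_VALUE, which is the "delete never actually deletes data" case described above.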
Tasks to be performed are:
* clear up (and document) the intended behavior of using KEEP_DELETED_CELLS=true as a default in Phoenix
* add tests to verify that this intended behavior still works with the ChunkedResultIterator
* document the implications and/or workaround if a large number of versions is configured for a column family