Posted to commits@cassandra.apache.org by "Jordan West (JIRA)" <ji...@apache.org> on 2015/05/15 22:53:59 UTC

[jira] [Assigned] (CASSANDRA-9406) Add Option to Not Validate Atoms During Scrub

     [ https://issues.apache.org/jira/browse/CASSANDRA-9406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jordan West reassigned CASSANDRA-9406:
--------------------------------------

    Assignee: Jordan West

> Add Option to Not Validate Atoms During Scrub
> ---------------------------------------------
>
>                 Key: CASSANDRA-9406
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9406
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Tools
>            Reporter: Jordan West
>            Assignee: Jordan West
>            Priority: Minor
>             Fix For: 2.0.x
>
>
> In Scrubber, the instantiation of SSTableIdentityIterator hardcodes checkData to true. This should be made configurable when running scrub via JMX or StandaloneScrubber.
> Since inbound data is not validated, some stored data is not corrupt but merely "misrepresented" (e.g. an int is stored but validator = LongType). Cassandra and application clients will happily continue to read and write data with this misrepresentation (although some care may need to be taken on the application side), but Scrub without this option will throw these rows out, leading to a large amount of data loss.
> In these applications it is desirable for scrub to check for row/file corruption but not validate the column values. This would be made possible by adding a flag to disable validation in the SSTableIdentityIterator.
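
To illustrate the behaviour described above, here is a minimal, self-contained sketch. It does not use Cassandra's classes: `ScrubValidationSketch`, `isValidLong` and `scrub` are hypothetical names, and the length rule (a long value must be empty or exactly 8 bytes) is an assumption standing in for what a LongType-style validator would check. The `checkData` parameter mirrors the flag named in the issue, but how it would actually be wired through Scrubber and SSTableIdentityIterator is not shown here.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

public class ScrubValidationSketch {

    // Hypothetical stand-in for a LongType-style validator (assumption):
    // a value is accepted only if it is empty or exactly 8 bytes long.
    static boolean isValidLong(ByteBuffer value) {
        int n = value.remaining();
        return n == 0 || n == 8;
    }

    // Hypothetical scrub pass over column values. When checkData is true,
    // values that fail validation are dropped; when false, every value that
    // could be read is kept, whatever its length.
    static List<ByteBuffer> scrub(List<ByteBuffer> values, boolean checkData) {
        List<ByteBuffer> kept = new ArrayList<>();
        for (ByteBuffer v : values) {
            if (!checkData || isValidLong(v)) {
                kept.add(v);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<ByteBuffer> values = new ArrayList<>();
        values.add(ByteBuffer.allocate(8).putLong(0, 42L)); // well-formed 8-byte long
        values.add(ByteBuffer.allocate(4).putInt(0, 42));   // 4-byte int stored in a "long" column

        System.out.println("checkData=true  keeps " + scrub(values, true).size());  // prints 1
        System.out.println("checkData=false keeps " + scrub(values, false).size()); // prints 2
    }
}
```

With validation on, the 4-byte value is discarded even though it is readable; with validation off, it survives the pass, which is the outcome the proposed option is meant to allow.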



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)