Posted to common-dev@hadoop.apache.org by "Doug Cutting (JIRA)" <ji...@apache.org> on 2007/05/17 00:56:16 UTC

[jira] Commented: (HADOOP-1381) The distance between sync blocks in SequenceFiles should be configurable rather than hard coded to 2000 bytes

    [ https://issues.apache.org/jira/browse/HADOOP-1381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12496429 ] 

Doug Cutting commented on HADOOP-1381:
--------------------------------------

Why would this be better?  The current design is to add sync blocks as frequently as possible without significantly impacting file size.  This minimizes the amount of data that must be scanned when sync'ing.  How would making the interval larger help?
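
To put rough numbers on that trade-off, here is a minimal sketch rather than Hadoop's actual code. It assumes a sync marker of 20 bytes (a 4-byte escape plus a 16-byte hash), which is not stated in this thread, and it assumes a reader seeks to a uniformly random offset, so the average forward scan to the next marker is about half the interval. The function name `sync_tradeoff` is hypothetical.

```python
# Assumption (not from this thread): one sync marker occupies 20 bytes,
# i.e. a 4-byte escape followed by a 16-byte hash.
SYNC_MARKER_BYTES = 20


def sync_tradeoff(interval_bytes, marker_bytes=SYNC_MARKER_BYTES):
    """Return (overhead_fraction, avg_scan_bytes) for a sync interval.

    overhead_fraction: share of the file taken up by sync markers when one
        marker is written after every `interval_bytes` of record data.
    avg_scan_bytes: approximate number of bytes a reader scans to reach the
        next marker after seeking to a uniformly random offset.
    """
    if interval_bytes <= 0:
        raise ValueError("interval_bytes must be positive")
    overhead_fraction = marker_bytes / (interval_bytes + marker_bytes)
    avg_scan_bytes = interval_bytes / 2
    return overhead_fraction, avg_scan_bytes


if __name__ == "__main__":
    # Compare the hard-coded 2000-byte interval with the proposed ~1 MB one.
    for interval in (2000, 1_000_000):
        overhead, scan = sync_tradeoff(interval)
        print(f"interval={interval:>9}  overhead={overhead:.4%}  avg scan={scan:,.0f} bytes")
```

Under these assumptions the 2000-byte interval already costs only about 1% of file size, so a larger interval saves little space while lengthening every scan in proportion.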

> The distance between sync blocks in SequenceFiles should be configurable rather than hard coded to 2000 bytes
> -------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1381
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1381
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: io
>            Reporter: Owen O'Malley
>             Fix For: 0.14.0
>
>
> Currently SequenceFiles insert a sync block every 2000 bytes. It would be much better if the interval were configurable, with a much higher default (1 MB or so?).
