Posted to user@hadoop.apache.org by Adam Silberstein <ad...@trifacta.com> on 2015/01/29 02:33:20 UTC

HDFS seek perf question

Hi,
I have a question about HDFS seek performance.  I see some information on
this periodically, but nothing recent.

How do these costs compare?
A) seeking to the start of an HDFS block and reading about 10MB of data
B) reading the entire HDFS block

Assuming A is faster, how many random seeks can you do against an HDFS
block before that becomes slower than reading the whole block?  On paper
this can be computed from the disk's seek time and transfer rate, but I
would like to know how well HDFS matches that behavior in practice.
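
To make the on-paper estimate concrete, here is a minimal sketch of the
break-even calculation. All numbers are illustrative assumptions, not
measurements: a 128 MB block, 10 MB per random read, 100 MB/s sequential
throughput, and 10 ms per seek. The function names are hypothetical. The
model covers only raw disk cost and ignores any HDFS-level overhead
(connection setup, checksum verification, caching), which is exactly the
part the question is about.

```python
import math


def whole_block_seconds(block_mb, throughput_mb_s, seek_s):
    """One seek to the block start, then a sequential read of the whole block."""
    return seek_s + block_mb / throughput_mb_s


def random_reads_seconds(n_reads, read_mb, throughput_mb_s, seek_s):
    """n_reads independent reads, each paying one seek plus its transfer time."""
    return n_reads * (seek_s + read_mb / throughput_mb_s)


def break_even_reads(block_mb, read_mb, throughput_mb_s, seek_s):
    """Largest number of random reads that is still no slower than one full read."""
    whole = whole_block_seconds(block_mb, throughput_mb_s, seek_s)
    per_read = seek_s + read_mb / throughput_mb_s
    return math.floor(whole / per_read)


# Assumed values (hypothetical, for illustration only).
BLOCK_MB = 128          # HDFS block size
READ_MB = 10            # data read after each seek
THROUGHPUT_MB_S = 100   # sequential disk throughput
SEEK_S = 0.010          # average seek time

n = break_even_reads(BLOCK_MB, READ_MB, THROUGHPUT_MB_S, SEEK_S)
print(n)  # 11 under the assumed numbers
```

Under these assumptions, a full read costs about 1.29 s and each seek plus
10 MB read costs about 0.11 s, so 11 random reads are still cheaper and 12
are not. With 10 MB reads the transfer time dominates the seek time, so the
break-even is mostly set by how much of the block is read in total.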

Thanks,
Adam