Posted to dev@hbase.apache.org by "Tak-Lon (Stephen) Wu (Jira)" <ji...@apache.org> on 2021/09/09 18:35:00 UTC
[jira] [Created] (HBASE-26273) TableSnapshotInputFormat/TableSnapshotInputFormatImpl should use ReadType.STREAM for scanning HFiles
Tak-Lon (Stephen) Wu created HBASE-26273:
--------------------------------------------
Summary: TableSnapshotInputFormat/TableSnapshotInputFormatImpl should use ReadType.STREAM for scanning HFiles
Key: HBASE-26273
URL: https://issues.apache.org/jira/browse/HBASE-26273
Project: HBase
Issue Type: Improvement
Components: mapreduce
Affects Versions: 2.4.6, 3.0.0-alpha-1
Reporter: Tak-Lon (Stephen) Wu
After the change in HBASE-17917 that uses PREAD ({{ReadType.DEFAULT}}) for all user scans, the behavior of TableSnapshotInputFormat changed from STREAM to PREAD.
TableSnapshotInputFormat is supposed to be used with YARN/MR or another batch engine that should read the entire HFile in the container/executor. With the default always set to PREAD, the number of connections to HDFS surges, which has a side effect on overall performance.
The goal of this change is to make any downstream user of TableSnapshotInputFormat scan with STREAM.
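The proposed behavior could be sketched roughly as below. This is an illustration only, not the HBase implementation: {{ReadType}} here is a local stand-in that borrows just the constant names of HBase's {{Scan.ReadType}}, and {{SnapshotScanDefaults}} and {{resolveReadType}} are hypothetical names. It also assumes that an explicit PREAD or STREAM choice by the caller should be kept and only DEFAULT should be promoted to STREAM, which the issue text does not spell out.

```java
// Stand-in for HBase's Scan.ReadType; only the constant names come from HBase.
enum ReadType { DEFAULT, STREAM, PREAD }

// Hypothetical helper for illustration; not part of HBase.
final class SnapshotScanDefaults {
    private SnapshotScanDefaults() {}

    // Assumed rule: a snapshot scan left at DEFAULT (or unset) becomes STREAM,
    // since a batch task is expected to read the whole HFile sequentially.
    // An explicit caller choice (PREAD or STREAM) is returned unchanged.
    static ReadType resolveReadType(ReadType requested) {
        if (requested == null || requested == ReadType.DEFAULT) {
            return ReadType.STREAM;
        }
        return requested;
    }

    public static void main(String[] args) {
        // Print the resolved read type for each possible request.
        for (ReadType requested : ReadType.values()) {
            System.out.println(requested + " -> " + resolveReadType(requested));
        }
    }
}
```

Until such a default exists, a job author can presumably get the same effect by calling {{scan.setReadType(Scan.ReadType.STREAM)}} on the {{Scan}} handed to the snapshot job.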
--
This message was sent by Atlassian Jira
(v8.3.4#803005)