Posted to dev@hbase.apache.org by "nilonealex (Jira)" <ji...@apache.org> on 2020/12/01 09:55:00 UTC
[jira] [Created] (HBASE-25346) hbase2.x the speed of writing data is slower than version 1.x
nilonealex created HBASE-25346:
----------------------------------
Summary: hbase2.x the speed of writing data is slower than version 1.x
Key: HBASE-25346
URL: https://issues.apache.org/jira/browse/HBASE-25346
Project: HBase
Issue Type: Improvement
Affects Versions: 2.0.2
Reporter: nilonealex
Attachments: hbase-site.xml
I am verifying data load and query performance on HBase 2.0.2 (HDP 3.1.1) and HBase 1.2.0 (CDH 5.13.3), and found that putting data into HBase 2.0.2 is much slower than into HBase 1.2.0: the write speed of the former is almost half that of the latter. I use BufferedMutator and BufferedMutatorParams to batch puts and improve efficiency. Part of the code is as follows:
-----------------------------------------------------------------------
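// Puts are collected locally and handed to the BufferedMutator in batches of 5000.
List<Mutation> mutator = new ArrayList<>();
BufferedMutator table = null;
BufferedMutatorParams params = new BufferedMutatorParams(TableName.valueOf(fileHbRule.getHbaseTableName()));
// The multiplication is done in long arithmetic so that large buffer values cannot overflow an int.
params.writeBufferSize(fileHbRule.getFlushBuffer().intValue() * 1024L * 1024L);
table = connection.getBufferedMutator(params);
// The per-row loop that builds the Put p and increments totalCnts is not shown in this excerpt.
mutator.add(p);
if (totalCnts % 5000 == 0) {
    table.mutate(mutator);
    mutator.clear();
}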
-----------------------------------------------------------------------
The input is a comma-separated text file with 2 million rows, 110 columns per row, and a total size of about 1 GB. Apart from the main parameters such as heap memory, I kept the default values for most of the HBase service settings.
The load program is single-threaded.
The following is the progress output:
----------------------- HBase 1.2.0 (CDH 5.13.3) -----------------------
2020-12-01 16:48:18 inserted: 100000
2020-12-01 16:48:36 inserted: 200000
2020-12-01 16:48:52 inserted: 300000
2020-12-01 16:49:08 inserted: 400000
2020-12-01 16:49:23 inserted: 500000
2020-12-01 16:49:39 inserted: 600000
2020-12-01 16:49:56 inserted: 700000
2020-12-01 16:50:12 inserted: 800000
2020-12-01 16:50:29 inserted: 900000
2020-12-01 16:50:45 inserted: 1000000
2020-12-01 16:51:01 inserted: 1100000
2020-12-01 16:51:17 inserted: 1200000
2020-12-01 16:51:34 inserted: 1300000
2020-12-01 16:51:49 inserted: 1400000
2020-12-01 16:52:05 inserted: 1500000
2020-12-01 16:52:21 inserted: 1600000
2020-12-01 16:52:40 inserted: 1700000
2020-12-01 16:52:57 inserted: 1800000
2020-12-01 16:53:19 inserted: 1900000
2020-12-01 16:53:42 inserted: 2000000
2020-12-01 16:53:48 inserted: 2000000
imp finished ok!
--job finished--
----------------------- HBase 2.0.2 (HDP 3.1.1) -----------------------
2020-12-01 17:25:24 inserted: 100000
2020-12-01 17:26:03 inserted: 200000
2020-12-01 17:26:39 inserted: 300000
2020-12-01 17:27:13 inserted: 400000
2020-12-01 17:27:47 inserted: 500000
2020-12-01 17:28:23 inserted: 600000
2020-12-01 17:29:03 inserted: 700000
2020-12-01 17:29:40 inserted: 800000
2020-12-01 17:30:15 inserted: 900000
2020-12-01 17:30:51 inserted: 1000000
2020-12-01 17:31:27 inserted: 1100000
2020-12-01 17:32:03 inserted: 1200000
2020-12-01 17:32:39 inserted: 1300000
2020-12-01 17:33:14 inserted: 1400000
2020-12-01 17:33:50 inserted: 1500000
2020-12-01 17:34:25 inserted: 1600000
2020-12-01 17:35:01 inserted: 1700000
2020-12-01 17:35:38 inserted: 1800000
2020-12-01 17:36:14 inserted: 1900000
2020-12-01 17:36:51 inserted: 2000000
2020-12-01 17:36:55 inserted: 2000000
imp finished ok!
--job finished--
returnCode=0
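For reference, the average throughput implied by the two progress logs above can be worked out from their first and last timestamps. The following is only a minimal sketch and not part of the load program: the class and method names are hypothetical, and because the logs do not show when each job started, the first 100000 rows are left out of the calculation.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper for comparing the two progress logs; not from the load program.
public class LoadThroughput {
    private static final DateTimeFormatter FMT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    /** Seconds elapsed between two log timestamps such as "2020-12-01 16:48:18". */
    public static long elapsedSeconds(String start, String end) {
        return Duration.between(LocalDateTime.parse(start, FMT),
                                LocalDateTime.parse(end, FMT)).getSeconds();
    }

    /** Average rows per second between two progress lines. */
    public static double rowsPerSecond(long rowsStart, long rowsEnd,
                                       String start, String end) {
        return (double) (rowsEnd - rowsStart) / elapsedSeconds(start, end);
    }

    public static void main(String[] args) {
        // First and last progress lines of each run, copied from the logs above.
        double v1 = rowsPerSecond(100_000, 2_000_000,
                "2020-12-01 16:48:18", "2020-12-01 16:53:48");
        double v2 = rowsPerSecond(100_000, 2_000_000,
                "2020-12-01 17:25:24", "2020-12-01 17:36:55");
        System.out.printf("HBase 1.2.0: %.0f rows/s%n", v1);
        System.out.printf("HBase 2.0.2: %.0f rows/s%n", v2);
        System.out.printf("2.0.2 / 1.2.0: %.2f%n", v2 / v1);
    }
}
```

With the timestamps above, the 1900000 rows after the first progress line take 330 seconds on HBase 1.2.0 and 691 seconds on HBase 2.0.2, i.e. roughly 5758 and 2750 rows per second, a ratio of about 0.48.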
Are there any key settings that I have not configured, or does this version have a performance defect?
--
This message was sent by Atlassian Jira
(v8.3.4#803005)