Posted to user@phoenix.apache.org by "Perko, Ralph J" <Ra...@pnnl.gov> on 2015/03/30 19:10:58 UTC
bulk load issue
Hi, I recently ran into a new issue with the CSV bulk loader. The MapReduce jobs run fine, but the HBase loading portion then seems to get stuck in a cycle of RpcRetryingCaller retries on the index tables.
Sample output – there are many of these, across all of the index tables:
15/03/30 09:55:21 INFO client.RpcRetryingCaller: Call exception, tries=20, retries=35, started=1424604 ms ago, cancelled=false, msg=row ' ' on table 'RAW_DATA_IDX' at region=RAW_DATA_IDX,\x09\x00\x00\x00\x00\x00\x00\x00,1427732357435.bfcbd84ad20046978ecf07b1a49b992c., hostname=server1,60020,1427726430154, seqNum=2
15/03/30 09:55:21 INFO client.RpcRetryingCaller: Call exception, tries=20, retries=35, started=1424636 ms ago, cancelled=false, msg=row '' on table 'RAW_DATA_IDX' at region=RAW_DATA_IDX,\x08\x00\x00\x00\x00\x00\x00\x00,1427732357435.bac4e9778524eac1de2c1bc6bba11fde., hostname=server1,60020,1427726430154, seqNum=2
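For context, a Phoenix CSV bulk load of this kind is typically kicked off roughly like this (the client jar version, table name, ZooKeeper quorum, and HDFS paths below are placeholders, not taken from the original report):

```shell
# Hypothetical invocation of the Phoenix CSV bulk load tool.
# Adjust the client jar, table name, input path, and quorum for your cluster.
HADOOP_CLASSPATH=/etc/hbase/conf \
hadoop jar phoenix-4.3.0-client.jar \
    org.apache.phoenix.mapreduce.CsvBulkLoadTool \
    --table DATA_TABLE \
    --input /data/raw/input.csv \
    --zookeeper zk1,zk2,zk3
```

The tool runs a MapReduce job to write HFiles, then hands them to HBase for the bulk-load step — which is the phase where the retries above appear.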
I recently upgraded to Phoenix 4.3 (and Hortonworks 2.2 – HBase 0.98.4.2.2.0.0); things worked prior to the upgrade.
Create statement (obfuscated a bit):
CREATE TABLE IF NOT EXISTS data_table
(
file_name VARCHAR NOT NULL,
rec_num INTEGER NOT NULL,
m.f1 VARCHAR,
m.f2 VARCHAR,
m.f3 VARCHAR,
m.f4 VARCHAR,
m.f5 VARCHAR,
m.f6 VARCHAR,
m.f7 VARCHAR
CONSTRAINT pkey PRIMARY KEY (file_name,rec_num)
) TTL='7776000',IMMUTABLE_ROWS=true,KEEP_DELETED_CELLS='false',COMPRESSION='SNAPPY',SALT_BUCKETS=10,SPLIT_POLICY='org.apache.hadoop.hbase.regionserver.ConstantSizeRegionSplitPolicy';
-- indexes
CREATE INDEX IF NOT EXISTS raw_data_idx ON data_table(m.f1) TTL='7776000',KEEP_DELETED_CELLS='false',COMPRESSION='SNAPPY',MAX_FILESIZE='1000000000',SPLIT_POLICY='org.apache.hadoop.hbase.regionserver.ConstantSizeRegionSplitPolicy';
CREATE INDEX IF NOT EXISTS data_table_f2f3_idx ON data_table(m.f2,m.f3) TTL='7776000',KEEP_DELETED_CELLS='false',COMPRESSION='SNAPPY',MAX_FILESIZE='1000000000',SPLIT_POLICY='org.apache.hadoop.hbase.regionserver.ConstantSizeRegionSplitPolicy';
CREATE INDEX IF NOT EXISTS data_table_f4f5_idx ON data_table(m.f4,m.f5) TTL='7776000',KEEP_DELETED_CELLS='false',COMPRESSION='SNAPPY',MAX_FILESIZE='1000000000',SPLIT_POLICY='org.apache.hadoop.hbase.regionserver.ConstantSizeRegionSplitPolicy';
CREATE INDEX IF NOT EXISTS data_table_f6f7_idx ON data_table(m.f6,m.f7) TTL='7776000',KEEP_DELETED_CELLS='false',COMPRESSION='SNAPPY',MAX_FILESIZE='1000000000',SPLIT_POLICY='org.apache.hadoop.hbase.regionserver.ConstantSizeRegionSplitPolicy';
Any thoughts on what could be causing this?
Thanks,
Ralph
Re: bulk load issue
Posted by "Perko, Ralph J" <Ra...@pnnl.gov>.
Fixed. The issue was with HBase and the WAL. I shut down HBase, deleted the WAL, and things work fine now.
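For anyone hitting the same state, the recovery amounts to something like the following sketch (the WAL directory path varies by distribution and HBase version — verify yours before deleting anything):

```shell
# Stop HBase first so no region server is still writing to the WAL.
stop-hbase.sh    # or stop via Ambari / your distro's service manager

# WARNING: removing WALs discards any unflushed edits. Only do this when the
# cluster is wedged and the data can be re-loaded. The path below is typical
# for HDP; stock HBase deployments commonly use /hbase/WALs instead.
hdfs dfs -rm -r /apps/hbase/data/WALs

# Restart HBase.
start-hbase.sh
```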
Ralph