Posted to user@phoenix.apache.org by Sriram Nookala <sr...@firstfuel.com> on 2017/09/06 20:13:09 UTC

Phoenix CSV Bulk Load fails to load a large file

I'm trying to load a 3.5 GB file with 60 million rows using CsvBulkLoadTool.
It hangs while loading HFiles. The load runs successfully if I split the
file into two, but I'd like to avoid doing that. This is on Amazon EMR;
could this be an issue with disk space or memory? I have a single master
and 2 region servers with 16 GB of memory on each node.
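
For context, the load is kicked off roughly like the sketch below (the jar
path, table name, input path, and ZooKeeper quorum are placeholders, not
the actual ones):

  hadoop jar /usr/lib/phoenix/phoenix-client.jar \
      org.apache.phoenix.mapreduce.CsvBulkLoadTool \
      --table MY_TABLE \
      --input /data/input/rows.csv \
      --zookeeper zk-host:2181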

Re: Phoenix CSV Bulk Load fails to load a large file

Posted by Sriram Nookala <sr...@firstfuel.com>.
Thanks, setting hbase.bulkload.retries.retryOnIOException to true in the
configuration worked. My HBase cluster is co-located with the YARN cluster
on EMR.
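
For anyone who hits the same thing, a minimal sketch of setting that
property per job (jar path, table, and input path are placeholders; the
generic -D option has to come before the tool's own arguments, since the
tool runs through ToolRunner):

  hadoop jar /usr/lib/phoenix/phoenix-client.jar \
      org.apache.phoenix.mapreduce.CsvBulkLoadTool \
      -Dhbase.bulkload.retries.retryOnIOException=true \
      --table MY_TABLE \
      --input /data/input/rows.csv

The same property can also be set cluster-wide in hbase-site.xml.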


Re: Phoenix CSV Bulk Load fails to load a large file

Posted by Ankit Singhal <an...@gmail.com>.
bq. This runs successfully if I split this into 2 files, but I'd like to
avoid doing that.
Do you run a separate job for each file?

If your HBase cluster is not co-located with your YARN cluster, it may be
that copying the large HFiles is timing out (this can happen when the HBase
table has too few regions, or when a region is hot-spotting). You can check
your output directory for file sizes to see whether you are hitting this
problem. Consider increasing the timeout or splitting the hot region; a
sketch of each follows.
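
A rough sketch of those checks (output path, table name, split point, and
timeout value are all hypothetical):

  # look for oversized HFiles in the bulk-load output directory
  hdfs dfs -du -h /tmp/phoenix-bulkload/MY_TABLE

  # give the bulk-load RPC more time (value in ms); the generic -D option
  # must come before the tool's own arguments
  hadoop jar /usr/lib/phoenix/phoenix-client.jar \
      org.apache.phoenix.mapreduce.CsvBulkLoadTool \
      -Dhbase.rpc.timeout=600000 \
      --table MY_TABLE --input /data/input/rows.csv

  # split a hot region at a chosen row key from the HBase shell
  echo "split 'MY_TABLE', 'row-key-to-split-at'" | hbase shell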





Re: Phoenix CSV Bulk Load fails to load a large file

Posted by Ted Yu <yu...@gmail.com>.
bq. hbase.bulkload.retries.retryOnIOException is disabled. Unable to recover

The above is from HBASE-17165.

See if the load can pass after enabling the config.


Re: Phoenix CSV Bulk Load fails to load a large file

Posted by Sriram Nookala <sr...@firstfuel.com>.
It finally times out with these exceptions

Wed Sep 06 21:38:07 UTC 2017, RpcRetryingCaller{globalStartTime=1504731276347, pause=100, retries=35}, java.io.IOException: Call to ip-10-123-0-60.ec2.internal/10.123.0.60:16020 failed on local exception: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call id=77, waitTime=60001, operationTimeout=60000 expired.

        at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:159)
        at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.tryAtomicRegionLoad(LoadIncrementalHFiles.java:956)
        at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles$2.call(LoadIncrementalHFiles.java:594)
        at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles$2.call(LoadIncrementalHFiles.java:590)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Call to ip-10-123-0-60.ec2.internal/10.123.0.60:16020 failed on local exception: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call id=77, waitTime=60001, operationTimeout=60000 expired.
        at org.apache.hadoop.hbase.ipc.AbstractRpcClient.wrapException(AbstractRpcClient.java:292)
        at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1274)
        at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:227)
        at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:336)
        at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.bulkLoadHFile(ClientProtos.java:35408)
        at org.apache.hadoop.hbase.protobuf.ProtobufUtil.bulkLoadHFile(ProtobufUtil.java:1676)
        at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles$3.call(LoadIncrementalHFiles.java:656)
        at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles$3.call(LoadIncrementalHFiles.java:645)
        at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:137)
        ... 7 more
Caused by: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call id=77, waitTime=60001, operationTimeout=60000 expired.
        at org.apache.hadoop.hbase.ipc.Call.checkAndSetTimeout(Call.java:73)
        at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1248)
        ... 14 more

17/09/06 21:38:07 ERROR mapreduce.LoadIncrementalHFiles: hbase.bulkload.retries.retryOnIOException is disabled. Unable to recover
17/09/06 21:38:07 INFO zookeeper.ZooKeeper: Session: 0x15e58ca21fc004c closed
17/09/06 21:38:07 INFO zookeeper.ClientCnxn: EventThread shut down

Exception in thread "main" java.io.IOException: BulkLoad encountered an unrecoverable problem
        at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.bulkLoadPhase(LoadIncrementalHFiles.java:614)
        at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.doBulkLoad(LoadIncrementalHFiles.java:463)
        at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.doBulkLoad(LoadIncrementalHFiles.java:373)
        at org.apache.phoenix.mapreduce.AbstractBulkLoadTool.completebulkload(AbstractBulkLoadTool.java:355)
        at org.apache.phoenix.mapreduce.AbstractBulkLoadTool.submitJob(AbstractBulkLoadTool.java:332)
        at org.apache.phoenix.mapreduce.AbstractBulkLoadTool.loadData(AbstractBulkLoadTool.java:270)
        at org.apache.phoenix.mapreduce.AbstractBulkLoadTool.run(AbstractBulkLoadTool.java:183)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
        at org.apache.phoenix.mapreduce.CsvBulkLoadTool.main(CsvBulkLoadTool.java:101)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=35, exceptions:
Wed Sep 06 20:55:36 UTC 2017, RpcRetryingCaller{globalStartTime=1504731276347, pause=100, retries=35}, java.io.IOException: Call to ip-10-123-0-60.ec2.internal/10.123.0.60:16020 failed on local exception: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call id=9, waitTime=60002, operationTimeout=60000 expired.

Re: Phoenix CSV Bulk Load fails to load a large file

Posted by Sriram Nookala <sr...@firstfuel.com>.
Phoenix 4.11.0, HBase 1.3.1
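
In case it helps, the dump below was captured roughly like this (the
RunJar match assumes the tool was launched via hadoop jar):

  jps -l | grep RunJar    # find the bulk-load driver JVM's pid
  jstack <pid>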

This is what I get from jstack:

"main" #1 prio=5 os_prio=0 tid=0x00007fb3d0017000 nid=0x5de7 waiting on condition [0x00007fb3d75f7000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x00000000f2222588> (a java.util.concurrent.FutureTask)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
        at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:429)
        at java.util.concurrent.FutureTask.get(FutureTask.java:191)
        at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.bulkLoadPhase(LoadIncrementalHFiles.java:604)
        at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.doBulkLoad(LoadIncrementalHFiles.java:463)
        at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.doBulkLoad(LoadIncrementalHFiles.java:373)
        at org.apache.phoenix.mapreduce.AbstractBulkLoadTool.completebulkload(AbstractBulkLoadTool.java:355)
        at org.apache.phoenix.mapreduce.AbstractBulkLoadTool.submitJob(AbstractBulkLoadTool.java:332)
        at org.apache.phoenix.mapreduce.AbstractBulkLoadTool.loadData(AbstractBulkLoadTool.java:270)
        at org.apache.phoenix.mapreduce.AbstractBulkLoadTool.run(AbstractBulkLoadTool.java:183)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
        at org.apache.phoenix.mapreduce.CsvBulkLoadTool.main(CsvBulkLoadTool.java:101)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:136)

Re: Phoenix CSV Bulk Load fails to load a large file

Posted by Sergey Soldatov <se...@gmail.com>.
Do you have more details on the Phoenix/HBase versions you are using, as
well as on how it hangs (any exceptions or messages that might help us
understand the problem)?

Thanks,
Sergey
