Posted to user@hbase.apache.org by Mayuresh <ma...@gmail.com> on 2011/09/23 11:52:22 UTC

Fwd: Coprocessor execution - HBaseServer: (responseTooSlow) java.nio.channels.ClosedChannelException

Hi,

I am using the AggregationClient to perform an aggregate calculation on a
10M-row HBase table that is spread across around 27 regions.
Currently all the regions are on a single region server.

Briefly, here is what the code does:
        Leases leases = new Leases(40 * 60 * 1000, 5 * 1000);
        leases.start();
        long startTime = System.currentTimeMillis();
        System.out.println("Start time: " + startTime);
        Configuration conf = new Configuration(true);
        AggregationClient aClient = new AggregationClient(conf);

        Scan scan = new Scan();
        scan.addColumn(TEST_FAMILY, TEST_QUALIFIER);
        final ColumnInterpreter<Long, Long> ci = new LongColumnInterpreter();

        double avg = aClient.avg(TEST_TABLE, ci, scan);
        leases.close();
        long endTime = System.currentTimeMillis();
        System.out.println("End time: " + endTime);
        System.out.println("Avg is: " + avg);
        System.out.println("Time taken is: " + (endTime - startTime));

However, it fails after some time with the following log messages
in the region server log:
2011-09-23 14:21:25,649 DEBUG
org.apache.hadoop.hbase.regionserver.HRegion: Updates disabled for
region Test,serv3-1315575650141,1316682342728.b3fe66994b0f2fb0643c88f6897474a1.
2011-09-23 14:21:25,650 DEBUG
org.apache.hadoop.hbase.io.hfile.HFileReaderV2: On close of file
15761056997240226 evicted 344 block(s)
2011-09-23 14:21:25,650 DEBUG
org.apache.hadoop.hbase.regionserver.Store: closed data
2011-09-23 14:21:25,650 INFO
org.apache.hadoop.hbase.regionserver.HRegion: Closed
Test,serv3-1315575650141,1316682342728.b3fe66994b0f2fb0643c88f6897474a1.
2011-09-23 14:21:25,650 DEBUG
org.apache.hadoop.hbase.regionserver.handler.CloseRegionHandler:
Closed region Test,serv3-1315575650141,1316682342728.b3fe66994b0f2fb0643c88f6897474a1.
2011-09-23 14:21:25,651 WARN org.apache.hadoop.ipc.HBaseServer: IPC
Server Responder, call execCoprocessor([B@6b6ab732,
getAvg(org.apache.hadoop.hbase.client.coprocessor.LongColumnInterpreter@2b216ab6,
{"timeRange":[0,9223372036854775807],"batch":-1,"startRow":"serv1-1315571539741","stopRow":"serv5-1315575834916","totalColumns":0,"cacheBlocks":true,"families":{"data":[]},"maxVersions":1,"caching":-1}))
from 137.72.240.177:47563: output error
2011-09-23 14:21:25,651 WARN org.apache.hadoop.ipc.HBaseServer: IPC
Server handler 9 on 60020 caught:
java.nio.channels.ClosedChannelException
        at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:133)
        at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:324)
        at org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1501)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:876)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:955)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Call.sendResponseIfReady(HBaseServer.java:390)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1240)

And the following in the zookeeper log:
2011-09-23 14:19:50,978 INFO
org.apache.zookeeper.server.NIOServerCnxn: Established session
0x1329577cad70004 with negotiated timeout 180000 for client
/137.72.240.177:45431
2011-09-23 14:21:11,028 WARN
org.apache.zookeeper.server.NIOServerCnxn: EndOfStreamException:
Unable to read additional data from client sessionid
0x1329577cad70004, likely client has closed socket
2011-09-23 14:21:11,029 INFO
org.apache.zookeeper.server.NIOServerCnxn: Closed socket connection
for client /137.72.240.177:45431 which had sessionid 0x1329577cad70004
2011-09-23 14:24:42,000 INFO
org.apache.zookeeper.server.ZooKeeperServer: Expiring session
0x1329577cad70004, timeout of 180000ms exceeded
2011-09-23 14:24:42,001 INFO
org.apache.zookeeper.server.PrepRequestProcessor: Processed session
termination for sessionid: 0x1329577cad70004

Any ideas on how to resolve this?

Thanks in advance.

Regards,
Mayuresh

Re: Coprocessor execution - HBaseServer: (responseTooSlow) java.nio.channels.ClosedChannelException

Posted by Stack <st...@duboce.net>.
responseTooSlow is a new feature in TRUNK.  Slow responses get logged
with detail, and I suppose your long-running CP qualifies.  We should
make it so there can be exceptions to that logging.  The
ClosedChannelException usually means the client has given up waiting;
it timed out.  Try setting "ipc.socket.timeout" to something large.

St.Ack


Re: Coprocessor execution - HBaseServer: (responseTooSlow) java.nio.channels.ClosedChannelException

Posted by Himanshu Vashishtha <hv...@cs.ualberta.ca>.
Hey Mayuresh!
Can you try setting the Scan caching value to a larger number (a few
hundred or more)?
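
Something like this on the Scan you pass to the AggregationClient (just a
sketch; 500 is an arbitrary example value):

        Scan scan = new Scan();
        scan.addColumn(TEST_FAMILY, TEST_QUALIFIER);
        // Use a larger caching value, per the suggestion above.
        scan.setCaching(500);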

Cheers,
Himanshu



Re: Coprocessor execution - HBaseServer: (responseTooSlow) java.nio.channels.ClosedChannelException

Posted by Mayuresh <ma...@gmail.com>.
Thanks everyone for guiding me on this!

Setting hbase.rpc.timeout did the trick!

On Mon, Sep 26, 2011 at 11:22 PM, Mayuresh
<ma...@gmail.com> wrote:
> Thanks Gary! That did the trick. It's now working just fine!
>
> Regards,
> Mayuresh
>
> On Mon, Sep 26, 2011 at 9:50 PM, Mayuresh <ma...@gmail.com> wrote:
>> Thanks Gary! I'll try that out.
>>
>> Regarding the bouncing mails, I got around it by setting the mail
>> format to plain text.
>>
>> Thanks again!
>>
>> On Mon, Sep 26, 2011 at 8:37 PM, Gary Helmling <gh...@gmail.com> wrote:
>>> Mayuresh,
>>>
>>> Apologies for replying off list.  The user mailing list seems to be bouncing
>>> all of my messages as spam.
>>>
>>> See my response below on the call timeout.
>>>
>>> Gary
>>>
>>>
>>> ---------- Forwarded message ----------
>>> From: Gary Helmling <gh...@gmail.com>
>>> Date: Mon, Sep 26, 2011 at 7:57 AM
>>> Subject: Re: Coprocessor execution - HBaseServer: (responseTooSlow)
>>> java.nio.channels.ClosedChannelException
>>> To: user@hbase.apache.org
>>>
>>>
>>>
>>>> > However I still fail with the same error:
>>>> >
>>>> > 2011-09-26 15:50:28,857 WARN org.apache.hadoop.ipc.HBaseServer:
>>>> > (responseTooSlow):
>>>> > {"processingtimems":118696,"call":"execCoprocessor([B@5b0e6f59,
>>>> >
>>>> > getAvg(org.apache.hadoop.hbase.client.coprocessor.LongColumnInterpreter@10b06ac3,
>>>> >
>>>> > {\"timeRange\":[0,9223372036854775807],\"batch\":-1,\"startRow\":\"\",\"stopRow\":\"\",\"totalColumns\":0,\"cacheBlocks\":true,\"families\":{\"data\":[]},\"maxVersions\":1,\"caching\":-1}))","client":"137.72.240.180:54843","starttimems":1317032310155,"queuetimems":9,"class":"HRegionServer","responsesize":0,"method":"execCoprocessor"}
>>>
>>> This shows that the request took almost 2 minutes to process (118696
>>> milliseconds).
>>>
>>> I believe this is the client-side RPC timeout (default 60 seconds) from
>>> HBASE-3154 kicking in.  Try adding this to the hbase-site.xml you are using
>>> on the clients:
>>>
>>> <property>
>>>   <name>hbase.rpc.timeout</name>
>>>   <value>300000</value>
>>> </property>
>>>
>>> This will raise the client-side RPC timeout to 5 minutes.  You can increase it
>>> further if you think it necessary, or even disable it completely by setting it
>>> to 0.  Note that if you disable it and the server-side handler somehow gets
>>> hung up (which would be a problem in itself), the client could keep waiting
>>> indefinitely for the response.
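>>>
>>> Equivalently (just a sketch, not something I have tested here), you can set
>>> it on the client Configuration before constructing the AggregationClient:
>>>
>>>     Configuration conf = HBaseConfiguration.create();
>>>     conf.setLong("hbase.rpc.timeout", 300000); // 5 minutes, same as the XML above
>>>     AggregationClient aClient = new AggregationClient(conf);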
>>>
>>>
>>
>>
>>
>> --
>> -Mayuresh
>>
>
>
>
> --
> -Mayuresh
>



-- 
-Mayuresh

Re: Coprocessor execution - HBaseServer: (responseTooSlow) java.nio.channels.ClosedChannelException

Posted by Mayuresh <ma...@gmail.com>.
Any guidance, guys? I am not getting any clue as to how to solve this problem.




-- 
-Mayuresh

Re: Coprocessor execution - HBaseServer: (responseTooSlow) java.nio.channels.ClosedChannelException

Posted by Mayuresh <ma...@gmail.com>.
I increased the lease period in hbase-site.xml to 50 minutes:

  <property>
    <name>hbase.regionserver.lease.period</name>
    <!--value>60000</value-->
    <value>3000000</value>
    <description>HRegion server lease period in milliseconds. Default is
    60 seconds. Clients must report in within this period else they are
    considered dead.</description>
  </property>

However, I still fail with the same error:

2011-09-26 15:50:28,857 WARN org.apache.hadoop.ipc.HBaseServer:
(responseTooSlow):
{"processingtimems":118696,"call":"execCoprocessor([B@5b0e6f59,
getAvg(org.apache.hadoop.hbase.client.coprocessor.LongColumnInterpreter@10b06ac3,
{\"timeRange\":[0,9223372036854775807],\"batch\":-1,\"startRow\":\"\",\"stopRow\":\"\",\"totalColumns\":0,\"cacheBlocks\":true,\"families\":{\"data\":[]},\"maxVersions\":1,\"caching\":-1}))","client":"137.72.240.180:54843","starttimems":1317032310155,"queuetimems":9,"class":"HRegionServer","responsesize":0,"method":"execCoprocessor"}
2011-09-26 15:50:28,872 WARN org.apache.hadoop.ipc.HBaseServer: IPC
Server Responder, call execCoprocessor([B@5b0e6f59,
getAvg(org.apache.hadoop.hbase.client.coprocessor.LongColumnInterpreter@10b06ac3,
{"timeRange":[0,9223372036854775807],"batch":-1,"startRow":"","stopRow":"","totalColumns":0,"cacheBlocks":true,"families":{"data":[]},"maxVersions":1,"caching":-1}))
from 137.72.240.180:54843: output error
2011-09-26 15:50:28,873 WARN org.apache.hadoop.ipc.HBaseServer: IPC
Server handler 4 on 60020 caught:
java.nio.channels.ClosedChannelException
	at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:133)
	at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:324)
	at org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1501)
	at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:876)
	at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:955)
	at org.apache.hadoop.hbase.ipc.HBaseServer$Call.sendResponseIfReady(HBaseServer.java:390)
	at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1240)

What do the responseTooSlow warning and the "output error" mean? Any clues to
help find what's causing this?

Thanks.





-- 
-Mayuresh

Re: Coprocessor execution - HBaseServer: (responseTooSlow) java.nio.channels.ClosedChannelException

Posted by Mayuresh <ma...@gmail.com>.
Hi Ted,

Yes, I am aware that this isn't a good setup. I am working on
understanding the coprocessors before I move to a bigger setup.

I am talking about the presentation at
https://hbase.s3.amazonaws.com/hbase/HBase-CP-HUG10.pdf by Andrew
Purtell, slide 5:

<quote>
For long running jobs the client must periodically poll status to keep it
alive; jobs without interest will be cancelled
</quote>


On Mon, Sep 26, 2011 at 2:08 PM, Ted Yu <yu...@gmail.com> wrote:
> Looks like my response to user@ got bounced.
>
> Himanshu is continuing the work on HBASE-3607
>
>>> Currently all the regions are on a single region server.
> I don't think the above is a good setup.
>
>>> It was mentioned to poll the job status periodically to avoid timeouts.
> Can you tell us more about the presentation? Normally you can include the
> author in the cc list of this email chain.
>
> Cheers
>



-- 
-Mayuresh

Fwd: Coprocessor execution - HBaseServer: (responseTooSlow) java.nio.channels.ClosedChannelException

Posted by Mayuresh <ma...@gmail.com>.
Yes, the computation takes longer than 60 seconds. Will setting the lease
timeout in hbase-site.xml help?

In one of the coprocessor presentations it was mentioned that the client
should poll the job status periodically to avoid timeouts. Any idea how to do that?


Re: Coprocessor execution - HBaseServer: (responseTooSlow) java.nio.channels.ClosedChannelException

Posted by Stack <st...@duboce.net>.
Is it doing a long GC?  Is the aggregation taking longer than the
client's 60-second timeout? (You are setting a lease below, but I don't
think that will have an effect since that is server-side.)

St.Ack

