Posted to user@hadoop.apache.org by Haitao Yao <ya...@gmail.com> on 2012/12/04 08:09:28 UTC

Socket timeout for BlockReaderLocal

hi, all
	I'm using Hadoop 1.2.0, java version "1.7.0_05".
	When running my Pig script, the workers always report the error below, and the MR jobs run very slowly.
	Increasing the dfs.socket.timeout value does not help. The network is OK, and telnet to port 50020 always succeeds.
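	For reference, this is roughly how I raised it, as a minimal hdfs-site.xml sketch (dfs.socket.timeout is, as far as I know, the Hadoop 1.x client read-timeout property, in milliseconds; 180000 below is just an example value):

	  <!-- hdfs-site.xml: DFS client socket read timeout, in milliseconds.
	       The Hadoop 1.x default is 60000. -->
	  <property>
	    <name>dfs.socket.timeout</name>
	    <value>180000</value>
	  </property>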
	Here's the stack trace:
2012-12-04 14:29:41,323 INFO org.apache.hadoop.hdfs.DFSClient: Failed to read blk_-2337696885631113108_11054058 on local machine
java.net.SocketTimeoutException: Call to /10.130.110.80:50020 failed on socket timeout exception: java.net.SocketTimeoutException: 10000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/10.130.110.80:57689 remote=/10.130.110.80:50020]
	at org.apache.hadoop.ipc.Client.wrapException(Client.java:1140)
	at org.apache.hadoop.ipc.Client.call(Client.java:1112)
	at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
	at $Proxy3.getProtocolVersion(Unknown Source)
	at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:411)
	at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:392)
	at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:374)
	at org.apache.hadoop.hdfs.DFSClient.createClientDatanodeProtocolProxy(DFSClient.java:212)
	at org.apache.hadoop.hdfs.BlockReaderLocal$LocalDatanodeInfo.getDatanodeProxy(BlockReaderLocal.java:90)
	at org.apache.hadoop.hdfs.BlockReaderLocal$LocalDatanodeInfo.access$200(BlockReaderLocal.java:65)
	at org.apache.hadoop.hdfs.BlockReaderLocal.getBlockPathInfo(BlockReaderLocal.java:224)
	at org.apache.hadoop.hdfs.BlockReaderLocal.newBlockReader(BlockReaderLocal.java:145)
	at org.apache.hadoop.hdfs.DFSClient.getLocalBlockReader(DFSClient.java:509)
	at org.apache.hadoop.hdfs.DFSClient.access$800(DFSClient.java:78)
	at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:2231)
	at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:2384)
	at java.io.DataInputStream.read(DataInputStream.java:149)
	at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
	at java.io.BufferedInputStream.read(BufferedInputStream.java:254)
	at org.apache.pig.impl.io.BufferedPositionedInputStream.read(BufferedPositionedInputStream.java:52)
	at org.apache.pig.impl.io.InterRecordReader.nextKeyValue(InterRecordReader.java:86)
	at org.apache.pig.impl.io.InterStorage.getNext(InterStorage.java:77)
	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:187)
	at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:533)
	at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1136)
	at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.net.SocketTimeoutException: 10000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/10.130.110.80:57689 remote=/10.130.110.80:50020]
	at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
	at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:155)
	at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:128)
	at java.io.FilterInputStream.read(FilterInputStream.java:133)
	at org.apache.hadoop.ipc.Client$Connection$PingInputStream.read(Client.java:361)
	at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
	at java.io.BufferedInputStream.read(BufferedInputStream.java:254)
	at java.io.DataInputStream.readInt(DataInputStream.java:387)
	at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:841)
	at org.apache.hadoop.ipc.Client$Connection.run(Client.java:786)

I checked the source code; the exception is thrown here, in org.apache.hadoop.net.SocketIOWithTimeout.doIO():
      // now wait for socket to be ready.
      int count = 0;
      try {
        count = selector.select(channel, ops, timeout);
      } catch (IOException e) { // unexpected IOException.
        closed = true;
        throw e;
      }

      if (count == 0) {
        // here!!
        throw new SocketTimeoutException(timeoutExceptionString(channel,
                                                                timeout, ops));
      }

	Why did the selector select nothing? The datanode is not under heavy load, and GC and the network are all fine.
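
	As I understand the java.nio semantics, select(timeout) returning 0 simply means the channel never became readable within the timeout, even though the connection itself is healthy. A minimal standalone sketch of that behavior (not Hadoop code; the host and port are taken from the trace above, purely for illustration):

      import java.net.InetSocketAddress;
      import java.nio.channels.SelectionKey;
      import java.nio.channels.Selector;
      import java.nio.channels.SocketChannel;

      public class SelectTimeoutDemo {
        public static void main(String[] args) throws Exception {
          // Connect, then switch to non-blocking mode so the channel can
          // be registered with a selector, like the channel in the trace.
          SocketChannel ch = SocketChannel.open(
              new InetSocketAddress("10.130.110.80", 50020));
          ch.configureBlocking(false);
          Selector selector = Selector.open();
          ch.register(selector, SelectionKey.OP_READ);
          // Wait up to 10 s for the peer to send anything, mirroring the
          // 10000 ms budget in the stack trace.
          int count = selector.select(10000);
          if (count == 0) {
            // Connected, but the server wrote no bytes within the timeout:
            // exactly the condition SocketIOWithTimeout turns into a
            // SocketTimeoutException.
            System.out.println("nothing became readable within the timeout");
          }
          ch.close();
          selector.close();
        }
      }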
	
	Thanks.
	

Haitao Yao
yao.erix@gmail.com
weibo: @haitao_yao
Skype:  haitao.yao.final


Re: Socket timeout for BlockReaderLocal

Posted by Robert Molina <rm...@hortonworks.com>.
Hi Haitao,
To help isolate the problem, what happens if you run a different job? Also,
if you view the namenode web UI, or the web UI of the specific datanode
having the issue, are there any indicators of it being down?
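
For example, a minimal liveness probe against the datanode web UI (a sketch only; it assumes the Hadoop 1.x default datanode HTTP port 50075, so adjust it if your dfs.datanode.http.address says otherwise):

    import java.net.HttpURLConnection;
    import java.net.URL;

    public class DatanodeUiCheck {
      public static void main(String[] args) throws Exception {
        // 50075 is the stock datanode HTTP port in Hadoop 1.x.
        URL url = new URL("http://10.130.110.80:50075/");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setConnectTimeout(5000);
        conn.setReadTimeout(5000);
        // Any HTTP response at all means the datanode process behind the
        // web UI is up and answering.
        System.out.println("datanode web UI: HTTP " + conn.getResponseCode());
        conn.disconnect();
      }
    }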

Regards,
Robert

Re: Socket timeout for BlockReaderLocal

Posted by panfei <cn...@gmail.com>.
I noticed that you are using JDK 1.7; personally I prefer 1.6.x.
If your firewall is OK, you can check your RPC service to see if it is also
OK, and test it with: telnet 10.130.110.80 50020.
I suggested Hive because HQL (SQL-like) is familiar to most people, and the
learning curve is smooth.
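
One caveat with telnet: it only proves that the TCP connect succeeds, while the stack trace above times out while reading the RPC response on an already-connected channel. Here is a minimal sketch that probes the read phase instead (not a Hadoop tool; a healthy IPC server sends nothing unsolicited, so hitting the timeout here merely reproduces the wait and does not by itself prove a fault):

    import java.io.InputStream;
    import java.net.InetSocketAddress;
    import java.net.Socket;
    import java.net.SocketTimeoutException;

    public class ReadProbe {
      public static void main(String[] args) throws Exception {
        try (Socket s = new Socket()) {
          s.connect(new InetSocketAddress("10.130.110.80", 50020), 5000);
          s.setSoTimeout(10000); // the same 10 s read budget as in the trace
          System.out.println("connected: " + s.getRemoteSocketAddress());
          InputStream in = s.getInputStream();
          int b = in.read(); // blocks until a byte arrives or SO_TIMEOUT fires
          System.out.println("first byte: " + b);
        } catch (SocketTimeoutException e) {
          System.out.println("connect OK, but nothing readable within 10 s");
        }
      }
    }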


2012/12/4 Haitao Yao <ya...@gmail.com>

> The firewall is OK.
> Well, personally I prefer Pig. And it's a big project, switching pig to
> hive is not an easy way.
> thanks.
>
> Haitao Yao
> yao.erix@gmail.com
> weibo: @haitao_yao
> Skype:  haitao.yao.final
>
> On 2012-12-4, at 下午3:14, panfei <cn...@gmail.com> wrote:
>
> check your firewall settings plz.  and why not use hive to do work ?
>
>
> 2012/12/4 Haitao Yao <ya...@gmail.com>
>
>> hi, all
>> I's using Hadoop 1.2.0 , java version "1.7.0_05"
>>  When running my pig script ,  the worker always report this error, and
>> the MR jobs run very slow.
>> Increase the dfs.socket.timeout value does not work. the network is ok,
>> telnet to 50020 port is always ok.
>>  here's the stacktrace:
>>
>> 2012-12-04 14:29:41,323 INFO org.apache.hadoop.hdfs.DFSClient: Failed to read blk_-2337696885631113108_11054058 on local machinejava.net.SocketTimeoutException: Call to /10.130.110.80:50020 failed on socket timeout exception: java.net.SocketTimeoutException: 10000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/10.130.110.80:57689 remote=/10.130.110.80:50020]
>> 	at org.apache.hadoop.ipc.Client.wrapException(Client.java:1140)
>> 	at org.apache.hadoop.ipc.Client.call(Client.java:1112)
>> 	at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
>> 	at $Proxy3.getProtocolVersion(Unknown Source)
>> 	at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:411)
>> 	at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:392)
>> 	at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:374)
>> 	at org.apache.hadoop.hdfs.DFSClient.createClientDatanodeProtocolProxy(DFSClient.java:212)
>> 	at org.apache.hadoop.hdfs.BlockReaderLocal$LocalDatanodeInfo.getDatanodeProxy(BlockReaderLocal.java:90)
>> 	at org.apache.hadoop.hdfs.BlockReaderLocal$LocalDatanodeInfo.access$200(BlockReaderLocal.java:65)
>> 	at org.apache.hadoop.hdfs.BlockReaderLocal.getBlockPathInfo(BlockReaderLocal.java:224)
>> 	at org.apache.hadoop.hdfs.BlockReaderLocal.newBlockReader(BlockReaderLocal.java:145)
>> 	at org.apache.hadoop.hdfs.DFSClient.getLocalBlockReader(DFSClient.java:509)
>> 	at org.apache.hadoop.hdfs.DFSClient.access$800(DFSClient.java:78)
>> 	at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:2231)
>> 	at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:2384)
>> 	at java.io.DataInputStream.read(DataInputStream.java:149)
>> 	at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
>> 	at java.io.BufferedInputStream.read(BufferedInputStream.java:254)
>> 	at org.apache.pig.impl.io.BufferedPositionedInputStream.read(BufferedPositionedInputStream.java:52)
>> 	at org.apache.pig.impl.io.InterRecordReader.nextKeyValue(InterRecordReader.java:86)
>> 	at org.apache.pig.impl.io.InterStorage.getNext(InterStorage.java:77)
>> 	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:187)
>> 	at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:533)
>> 	at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
>> 	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
>> 	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766)
>> 	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>> 	at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>> 	at java.security.AccessController.doPrivileged(Native Method)
>> 	at javax.security.auth.Subject.doAs(Subject.java:415)
>> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1136)
>> 	at org.apache.hadoop.mapred.Child.main(Child.java:249)
>> Caused by: java.net.SocketTimeoutException: 10000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/10.130.110.80:57689 remote=/10.130.110.80:50020]
>> 	at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
>> 	at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:155)
>> 	at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:128)
>> 	at java.io.FilterInputStream.read(FilterInputStream.java:133)
>> 	at org.apache.hadoop.ipc.Client$Connection$PingInputStream.read(Client.java:361)
>> 	at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
>> 	at java.io.BufferedInputStream.read(BufferedInputStream.java:254)
>> 	at java.io.DataInputStream.readInt(DataInputStream.java:387)
>> 	at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:841)
>> 	at org.apache.hadoop.ipc.Client$Connection.run(Client.java:786)
>>
>>
>> I checked the source code, the exception happens here:
>>       //now wait for socket to be ready.
>>       *int* count = 0;
>>       *try* {
>>         count = *selector*.select(channel, ops, timeout);
>>       } *catch* (IOException e) { //unexpected IOException.
>>         closed = *true*;
>>         *throw* e;
>>       }
>>
>>       *if* (count == 0) {
>> //here!!        *throw* *new* SocketTimeoutException(*
>> timeoutExceptionString*(channel,
>>                                                                 timeout,
>> ops));
>>       }
>>
>> Why the selector selected nothing? the data node is not under heavy load
>> , gc, network are all ok.
>>  Thanks.
>>
>>   Haitao Yao
>> yao.erix@gmail.com
>> weibo: @haitao_yao
>> Skype:  haitao.yao.final
>>
>>
>
>
> --
> 不学习,不知道
>
>
>


-- 
If you don't study, you won't know.

Re: Socket timeout for BlockReaderLocal

Posted by Haitao Yao <ya...@gmail.com>.
The firewall is OK.
Well, personally I prefer Pig. And it's a big project; switching from Pig to Hive would not be easy.
Thanks.

Haitao Yao
yao.erix@gmail.com
weibo: @haitao_yao
Skype:  haitao.yao.final

On 2012-12-4, at 3:14 PM, panfei <cn...@gmail.com> wrote:

> Check your firewall settings, please. And why not use Hive to do the work?
> 
> 
> 2012/12/4 Haitao Yao <ya...@gmail.com>
> hi, all
> 	I's using Hadoop 1.2.0 , java version "1.7.0_05"
> 	When running my pig script ,  the worker always report this error, and the MR jobs run very slow. 
> 	Increase the dfs.socket.timeout value does not work. the network is ok, telnet to 50020 port is always ok.
> 	here's the stacktrace: 
> 2012-12-04 14:29:41,323 INFO org.apache.hadoop.hdfs.DFSClient: Failed to read blk_-2337696885631113108_11054058 on local machinejava.net.SocketTimeoutException: Call to /10.130.110.80:50020 failed on socket timeout exception: java.net.SocketTimeoutException: 10000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/10.130.110.80:57689 remote=/10.130.110.80:50020]
> 	[stack frames elided; identical to the trace in the original post]
> Caused by: java.net.SocketTimeoutException: 10000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/10.130.110.80:57689 remote=/10.130.110.80:50020]
> 	at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
> 	[remaining stack frames elided]
> 
> I checked the source code; the exception is thrown here:
>       //now wait for socket to be ready.
>       int count = 0;
>       try {
>         count = selector.select(channel, ops, timeout);  
>       } catch (IOException e) { //unexpected IOException.
>         closed = true;
>         throw e;
>       } 
> 
>       if (count == 0) {
>         // here!!
>         throw new SocketTimeoutException(timeoutExceptionString(channel,
>                                                                 timeout, ops));
>       }
> 
> 	Why did the selector select nothing? The DataNode is not under heavy load, and GC and the network all look fine. (A standalone sketch of the select() timeout path follows this message.)
> 
> 	Thanks.
> 
> Haitao Yao
> yao.erix@gmail.com
> weibo: @haitao_yao
> Skype:  haitao.yao.final
> 
> 
> 
> 
> -- 
> If you don't learn, you won't know.
> 
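
The quoted snippet is from Hadoop's org.apache.hadoop.net.SocketIOWithTimeout (the SocketIOWithTimeout.doIO frame in the trace points there). The following is a minimal standalone sketch of the same condition, assuming a placeholder endpoint at 127.0.0.1:50020 where some server accepts the connection but never writes back: selector.select(timeout) returns 0 when nothing becomes readable within the timeout, and that zero is what the quoted code converts into a SocketTimeoutException.

    import java.net.InetSocketAddress;
    import java.nio.channels.SelectionKey;
    import java.nio.channels.Selector;
    import java.nio.channels.SocketChannel;

    public class SelectTimeoutDemo {
      public static void main(String[] args) throws Exception {
        // Placeholder endpoint: a server that accepts but stays silent.
        SocketChannel ch = SocketChannel.open(new InetSocketAddress("127.0.0.1", 50020));
        ch.configureBlocking(false);
        Selector selector = Selector.open();
        ch.register(selector, SelectionKey.OP_READ);

        // select(timeout) blocks until the channel is readable or the
        // timeout elapses; 0 means nothing became ready, the same
        // count == 0 branch shown in the quoted Hadoop code.
        int count = selector.select(10000L);
        if (count == 0) {
          System.out.println("select() returned 0 after 10s of silence");
        }
        selector.close();
        ch.close();
      }
    }

In other words, a successful telnet to port 50020 only proves the TCP connect; a count of 0 here means the DataNode's IPC server accepted the connection but did not answer the call within 10 seconds.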

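On the dfs.socket.timeout point, here is a minimal sketch, assuming the standard Hadoop 1.x Configuration API, of how that key would be raised on the client side (the same key can go into hdfs-site.xml instead). This is illustrative only: the value 60000 is an arbitrary example, and the 10000 ms timeout in the trace is raised on the RPC connection to the DataNode's IPC port 50020, which may be governed by a different setting than the data-transfer read timeout.

    import org.apache.hadoop.conf.Configuration;

    public class RaiseDfsSocketTimeout {
      public static void main(String[] args) {
        Configuration conf = new Configuration();
        // dfs.socket.timeout is the HDFS client/datanode socket read
        // timeout in milliseconds; set it once in hdfs-site.xml or
        // per job as shown here.
        conf.setInt("dfs.socket.timeout", 60000);
        System.out.println("dfs.socket.timeout = "
            + conf.getInt("dfs.socket.timeout", -1));
      }
    }
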

Re: Socket timeout for BlockReaderLocal

Posted by panfei <cn...@gmail.com>.
Please check your firewall settings. And why not use Hive to do the work?


2012/12/4 Haitao Yao <ya...@gmail.com>

> hi, all
> I'm using Hadoop 1.2.0, Java version "1.7.0_05".
> When running my Pig script, the workers always report this error, and the
> MR jobs run very slowly.
> Increasing the dfs.socket.timeout value does not help. The network is fine;
> telnet to port 50020 always succeeds.
> Here's the stack trace:
>
> [stack trace elided; identical to the one in the original post]
>
>
> I checked the source code; the exception is thrown here:
> [code snippet elided; identical to the one quoted earlier in this thread]
>
> Why did the selector select nothing? The DataNode is not under heavy load,
> and GC and the network all look fine.
>  Thanks.
>
> Haitao Yao
> yao.erix@gmail.com
> weibo: @haitao_yao
> Skype:  haitao.yao.final
>
>


-- 
If you don't learn, you won't know.
