Posted to user@nutch.apache.org by Kshitij Shukla <ks...@cisinlabs.com> on 2016/02/15 10:01:39 UTC

[CIS-CMMI-3] ScannerTimeoutException: 157036ms passed since the last invocation, timeout is currently set to 60000

Hello everyone,

During a very large crawl (crawling and then indexing to Solr), the parse step fails with the following exception:

**********************************************************START
Parsing :
/home/c1/apache-nutch-2.3.1/runtime/deploy/bin/nutch parse -D mapred.reduce.tasks=2 -D mapred.child.java.opts=-Xmx1000m -D mapred.reduce.tasks.speculative.execution=false -D mapred.map.tasks.speculative.execution=false -D mapred.compress.map.output=true -D mapred.skip.attempts.to.start.skipping=2 -D mapred.skip.map.max.skip.records=1 1455285944-9889 -crawlId 4
16/02/12 18:11:41 INFO parse.ParserJob: ParserJob: starting at 2016-02-12 18:11:41
16/02/12 18:11:41 INFO parse.ParserJob: ParserJob: resuming: false
16/02/12 18:11:41 INFO parse.ParserJob: ParserJob: forced reparse: false
16/02/12 18:11:41 INFO parse.ParserJob: ParserJob: batchId: 1455285944-9889
16/02/12 18:11:41 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/02/12 18:11:41 INFO plugin.PluginRepository: Plugins: looking in: /tmp/hadoop-root/hadoop-unjar7610794572323535076/classes/plugins
16/02/12 18:11:41 INFO plugin.PluginRepository: Plugin Auto-activation mode: [true]
16/02/12 18:11:41 INFO plugin.PluginRepository: Registered Plugins:
16/02/12 18:11:41 INFO plugin.PluginRepository:     Html Parse Plug-in (parse-html)
16/02/12 18:11:41 INFO plugin.PluginRepository:     MetaTags (parse-metatags)
16/02/12 18:11:41 INFO plugin.PluginRepository:     HTTP Framework (lib-http)
16/02/12 18:11:41 INFO plugin.PluginRepository:     Html Indexing Filter (index-html)
16/02/12 18:11:41 INFO plugin.PluginRepository:     the nutch core extension points (nutch-extensionpoints)
16/02/12 18:11:41 INFO plugin.PluginRepository:     Basic Indexing Filter (index-basic)
16/02/12 18:11:41 INFO plugin.PluginRepository:     XML Libraries (lib-xml)
16/02/12 18:11:41 INFO plugin.PluginRepository:     Anchor Indexing Filter (index-anchor)
16/02/12 18:11:41 INFO plugin.PluginRepository:     Basic URL Normalizer (urlnormalizer-basic)
16/02/12 18:11:41 INFO plugin.PluginRepository:     Language Identification Parser/Filter (language-identifier)
16/02/12 18:11:41 INFO plugin.PluginRepository:     Metadata Indexing Filter (index-metadata)
16/02/12 18:11:41 INFO plugin.PluginRepository:     CyberNeko HTML Parser (lib-nekohtml)
16/02/12 18:11:41 INFO plugin.PluginRepository:     Subcollection indexing and query filter (subcollection)
16/02/12 18:11:41 INFO plugin.PluginRepository:     SOLRIndexWriter (indexer-solr)
16/02/12 18:11:41 INFO plugin.PluginRepository:     Rel-Tag microformat Parser/Indexer/Querier (microformats-reltag)
16/02/12 18:11:41 INFO plugin.PluginRepository:     Regex URL Filter (urlfilter-regex)
16/02/12 18:11:41 INFO plugin.PluginRepository:     Http / Https Protocol Plug-in (protocol-httpclient)
16/02/12 18:11:41 INFO plugin.PluginRepository:     JavaScript Parser (parse-js)
16/02/12 18:11:41 INFO plugin.PluginRepository:     Tika Parser Plug-in (parse-tika)
16/02/12 18:11:41 INFO plugin.PluginRepository:     Top Level Domain Plugin (tld)
16/02/12 18:11:41 INFO plugin.PluginRepository:     Regex URL Filter Framework (lib-regex-filter)
16/02/12 18:11:41 INFO plugin.PluginRepository:     Regex URL Normalizer (urlnormalizer-regex)
16/02/12 18:11:41 INFO plugin.PluginRepository:     Link Analysis Scoring Plug-in (scoring-link)
16/02/12 18:11:41 INFO plugin.PluginRepository:     OPIC Scoring Plug-in (scoring-opic)
16/02/12 18:11:41 INFO plugin.PluginRepository:     Http Protocol Plug-in (protocol-http)
16/02/12 18:11:41 INFO plugin.PluginRepository:     More Indexing Filter (index-more)
16/02/12 18:11:41 INFO plugin.PluginRepository:     Creative Commons Plugins (creativecommons)
16/02/12 18:11:41 INFO plugin.PluginRepository: Registered Extension-Points:
16/02/12 18:11:41 INFO plugin.PluginRepository:     Parse Filter (org.apache.nutch.parse.ParseFilter)
16/02/12 18:11:41 INFO plugin.PluginRepository:     Nutch Index Cleaning Filter (org.apache.nutch.indexer.IndexCleaningFilter)
16/02/12 18:11:41 INFO plugin.PluginRepository:     Nutch Content Parser (org.apache.nutch.parse.Parser)
16/02/12 18:11:41 INFO plugin.PluginRepository:     Nutch URL Filter (org.apache.nutch.net.URLFilter)
16/02/12 18:11:41 INFO plugin.PluginRepository:     Nutch Scoring (org.apache.nutch.scoring.ScoringFilter)
16/02/12 18:11:41 INFO plugin.PluginRepository:     Nutch URL Normalizer (org.apache.nutch.net.URLNormalizer)
16/02/12 18:11:41 INFO plugin.PluginRepository:     Nutch Protocol (org.apache.nutch.protocol.Protocol)
16/02/12 18:11:41 INFO plugin.PluginRepository:     Nutch Index Writer (org.apache.nutch.indexer.IndexWriter)
16/02/12 18:11:41 INFO plugin.PluginRepository:     Nutch Indexing Filter (org.apache.nutch.indexer.IndexingFilter)
16/02/12 18:11:41 INFO conf.Configuration: found resource parse-plugins.xml at file:/tmp/hadoop-root/hadoop-unjar7610794572323535076/parse-plugins.xml
16/02/12 18:11:42 INFO crawl.SignatureFactory: Using Signature impl: org.apache.nutch.crawl.MD5Signature
16/02/12 18:11:42 INFO Configuration.deprecation: mapred.skip.map.max.skip.records is deprecated. Instead, use mapreduce.map.skip.maxrecords
16/02/12 18:11:42 INFO Configuration.deprecation: mapred.skip.attempts.to.start.skipping is deprecated. Instead, use mapreduce.task.skip.start.attempts
16/02/12 18:11:42 INFO Configuration.deprecation: mapred.map.tasks.speculative.execution is deprecated. Instead, use mapreduce.map.speculative
16/02/12 18:11:42 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
16/02/12 18:11:42 INFO Configuration.deprecation: mapred.compress.map.output is deprecated. Instead, use mapreduce.map.output.compress
16/02/12 18:11:42 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
16/02/12 18:11:42 INFO zookeeper.RecoverableZooKeeper: Process identifier=hconnection-0x63648ee9 connecting to ZooKeeper ensemble=ns614.mycyberhosting.com:2181,ns613.mycyberhosting.com:2181,ns615.mycyberhosting.com:2181
16/02/12 18:11:42 INFO zookeeper.ZooKeeper: Client environment:zookeeper.version=3.4.6-1569965, built on 02/20/2014 09:09 GMT
16/02/12 18:11:42 INFO zookeeper.ZooKeeper: Client environment:host.name=ns613.mycyberhosting.com
16/02/12 18:11:42 INFO zookeeper.ZooKeeper: Client environment:java.version=1.8.0_25
16/02/12 18:11:42 INFO zookeeper.ZooKeeper: Client environment:java.vendor=Oracle Corporation
16/02/12 18:11:42 INFO zookeeper.ZooKeeper: Client environment:java.home=/usr/lib/jvm/jdk1.8.0_25/jre
16/02/12 18:11:45 INFO zookeeper.ZooKeeper: Session: 0x152cf59e03c0155 closed
16/02/12 18:11:45 INFO zookeeper.ClientCnxn: EventThread shut down
16/02/12 18:11:45 INFO mapreduce.JobSubmitter: number of splits:3
16/02/12 18:11:46 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1455177662473_0067
16/02/12 18:11:46 INFO impl.YarnClientImpl: Submitted application application_1455177662473_0067
16/02/12 18:11:46 INFO mapreduce.Job: The url to track the job: http://ns613.mycyberhosting.com:8088/proxy/application_1455177662473_0067/
16/02/12 18:11:46 INFO mapreduce.Job: Running job: job_1455177662473_0067
16/02/12 18:11:55 INFO mapreduce.Job: Job job_1455177662473_0067 running in uber mode : false
16/02/12 18:11:55 INFO mapreduce.Job:  map 0% reduce 0%
16/02/12 18:15:50 INFO mapreduce.Job: Task Id : attempt_1455177662473_0067_m_000000_0, Status : FAILED
Error: java.lang.RuntimeException: org.apache.hadoop.hbase.client.ScannerTimeoutException: 157850ms passed since the last invocation, timeout is currently set to 60000
     at org.apache.gora.mapreduce.GoraRecordReader.nextKeyValue(GoraRecordReader.java:122)
     at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:533)
     at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
     at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
     at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
     at java.security.AccessController.doPrivileged(Native Method)
     at javax.security.auth.Subject.doAs(Subject.java:422)
     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
     at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: org.apache.hadoop.hbase.client.ScannerTimeoutException: 157850ms passed since the last invocation, timeout is currently set to 60000
     at org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:371)
     at org.apache.gora.hbase.query.HBaseScannerResult.nextInner(HBaseScannerResult.java:49)
     at org.apache.gora.query.impl.ResultBase.next(ResultBase.java:111)
     at org.apache.gora.mapreduce.GoraRecordReader.nextKeyValue(GoraRecordReader.java:118)
     ... 11 more
Caused by: org.apache.hadoop.hbase.UnknownScannerException: org.apache.hadoop.hbase.UnknownScannerException: Name: 3945, already closed?
     at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3146)
     at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29941)
     at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2027)
     at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:108)
     at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:110)
     at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:90)
     at java.lang.Thread.run(Thread.java:745)

     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
     at java.lang.reflect.Constructor.newInstance(Constructor.java:408)
     at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
     at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95)
     at org.apache.hadoop.hbase.protobuf.ProtobufUtil.getRemoteException(ProtobufUtil.java:284)
     at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:204)
     at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:59)
     at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:117)
     at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:93)
     at org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:355)
     ... 14 more
Caused by: org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.UnknownScannerException): org.apache.hadoop.hbase.UnknownScannerException: Name: 3945, already closed?
     at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3146)
     at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29941)
     at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2027)
     at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:108)
     at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:110)
     at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:90)
     at java.lang.Thread.run(Thread.java:745)

     at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1457)
     at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1661)
     at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1719)
     at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.scan(ClientProtos.java:30387)
     at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:174)
     ... 18 more

16/02/12 18:17:19 INFO mapreduce.Job:  map 33% reduce 0%
16/02/12 18:18:25 INFO mapreduce.Job:  map 67% reduce 0%
16/02/12 18:19:53 INFO mapreduce.Job: Task Id : attempt_1455177662473_0067_m_000000_1, Status : FAILED
Error: java.lang.RuntimeException: org.apache.hadoop.hbase.client.ScannerTimeoutException: 157036ms passed since the last invocation, timeout is currently set to 60000
[same stack trace as attempt _0 above, this time with UnknownScannerException "Name: 3949, already closed?"]

16/02/12 18:23:44 INFO mapreduce.Job: Task Id : attempt_1455177662473_0067_m_000000_2, Status : FAILED
Error: java.lang.RuntimeException: org.apache.hadoop.hbase.client.ScannerTimeoutException: 157740ms passed since the last invocation, timeout is currently set to 60000
[same stack trace as attempt _0 above, this time with UnknownScannerException "Name: 3951, already closed?"]

16/02/12 18:27:45 INFO mapreduce.Job:  map 100% reduce 0%
16/02/12 18:27:46 INFO mapreduce.Job: Job job_1455177662473_0067 failed with state FAILED due to: Task failed task_1455177662473_0067_m_000000
Job failed as tasks failed. failedMaps:1 failedReduces:0

16/02/12 18:27:46 INFO mapreduce.Job: Counters: 34
     File System Counters
         FILE: Number of bytes read=0
         FILE: Number of bytes written=232832
         FILE: Number of read operations=0
         FILE: Number of large read operations=0
         FILE: Number of write operations=0
         HDFS: Number of bytes read=2289
         HDFS: Number of bytes written=0
         HDFS: Number of read operations=2
         HDFS: Number of large read operations=0
         HDFS: Number of write operations=0
     Job Counters
         Failed map tasks=4
         Launched map tasks=6
         Other local map tasks=3
         Data-local map tasks=3
         Total time spent by all maps in occupied slots (ms)=3306258
         Total time spent by all reduces in occupied slots (ms)=0
         Total time spent by all map tasks (ms)=1653129
         Total vcore-seconds taken by all map tasks=1653129
         Total megabyte-seconds taken by all map tasks=6771216384
     Map-Reduce Framework
         Map input records=31186
         Map output records=30298
         Input split bytes=2289
         Spilled Records=0
         Failed Shuffles=0
         Merged Map outputs=0
         GC time elapsed (ms)=4836
         CPU time spent (ms)=577240
         Physical memory (bytes) snapshot=2860380160
         Virtual memory (bytes) snapshot=10654486528
         Total committed heap usage (bytes)=3088056320
     ParserStatus
         failed=27
         success=22978
     File Input Format Counters
         Bytes Read=0
     File Output Format Counters
         Bytes Written=0
Exception in thread "main" java.lang.RuntimeException: job failed: name=[4]parse, jobid=job_1455177662473_0067
     at org.apache.nutch.util.NutchJob.waitForCompletion(NutchJob.java:120)
     at org.apache.nutch.parse.ParserJob.run(ParserJob.java:260)
     at org.apache.nutch.parse.ParserJob.parse(ParserJob.java:286)
     at org.apache.nutch.parse.ParserJob.run(ParserJob.java:337)
     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
     at org.apache.nutch.parse.ParserJob.main(ParserJob.java:341)
     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
     at java.lang.reflect.Method.invoke(Method.java:483)
     at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Error running:
   /home/c1/apache-nutch-2.3.1/runtime/deploy/bin/nutch parse -D mapred.reduce.tasks=2 -D mapred.child.java.opts=-Xmx1000m -D mapred.reduce.tasks.speculative.execution=false -D mapred.map.tasks.speculative.execution=false -D mapred.compress.map.output=true -D mapred.skip.attempts.to.start.skipping=2 -D mapred.skip.map.max.skip.records=1 1455285944-9889 -crawlId 4
Failed with exit value 1.
**********************************************************END
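
In case it helps to state what I understand so far: the timeout in the message appears to be the HBase scanner lease, which expires when the mapper spends longer than 60 seconds parsing between two scanner next() calls. The workaround I am considering is to raise the scanner timeout on both the client and the region servers in hbase-site.xml, and to lower hbase.client.scanner.caching so each RPC returns sooner and the lease is renewed more often. This is only a sketch; the values below are untested guesses for our setup:

```
<!-- hbase-site.xml, on clients AND region servers; values are untested guesses -->
<property>
  <name>hbase.regionserver.lease.period</name>
  <value>240000</value> <!-- older name for the scanner lease, 4 minutes -->
</property>
<property>
  <name>hbase.client.scanner.timeout.period</name>
  <value>240000</value> <!-- newer name; keep in sync with the server side -->
</property>
<property>
  <name>hbase.client.scanner.caching</name>
  <value>1</value> <!-- fewer rows per RPC, so the scanner is touched more often -->
</property>
```

If anyone knows whether Gora picks these up from the job configuration as well, that would be good to confirm.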

Please advise.
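
For anyone trying to reproduce the failure mode without a cluster, my understanding of it can be sketched in miniature as a consumer that takes longer between fetches than the server-side lease allows (toy code, not the real HBase API):

```python
import time

class ScannerLease:
    """Toy model of a server-side scanner lease: the 'server' rejects the
    next fetch if the client waited longer than `timeout` seconds."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.last_call = time.monotonic()

    def next_batch(self):
        elapsed = time.monotonic() - self.last_call
        if elapsed > self.timeout:
            # Mirrors "Nms passed since the last invocation, timeout is ..."
            raise RuntimeError(
                f"{int(elapsed * 1000)}ms passed since the last invocation, "
                f"timeout is currently set to {int(self.timeout * 1000)}")
        self.last_call = time.monotonic()
        return ["row"] * 5  # pretend batch of rows

lease = ScannerLease(timeout=0.05)  # 50 ms lease, standing in for the 60000 ms default
lease.next_batch()                  # fast consumer: fine
time.sleep(0.1)                     # slow "parse" work between fetches
try:
    lease.next_batch()              # lease expired, fetch rejected
except RuntimeError as e:
    print("expired:", e)
```

This matches why the error only shows up on large crawls: the bigger the pages in a batch, the longer the mapper parses between scanner calls.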
-- 

Please let me know if you have any questions , concerns or updates.
Have a great day ahead :)

Thanks and Regards,

Kshitij Shukla
Software developer

*Cyber Infrastructure(CIS)
**/The RightSourcing Specialists with 1250 man years of experience!/*

DISCLAIMER:  INFORMATION PRIVACY is important for us, If you are not the 
intended recipient, you should delete this message and are notified that 
any disclosure, copying or distribution of this message, or taking any 
action based on it, is strictly prohibited by Law.

Please don't print this e-mail unless you really need to.

-- 

------------------------------

*Cyber Infrastructure (P) Limited, [CIS] **(CMMI Level 3 Certified)*

Central India's largest Technology company.

*Ensuring the success of our clients and partners through our highly 
optimized Technology solutions.*

www.cisin.com | +Cisin <https://plus.google.com/+Cisin/> | Linkedin 
<https://www.linkedin.com/company/cyber-infrastructure-private-limited> | 
Offices: *Indore, India.* *Singapore. Silicon Valley, USA*.
