Posted to user@nutch.apache.org by Kshitij Shukla <ks...@cisinlabs.com> on 2016/02/15 10:01:39 UTC
[CIS-CMMI-3] ScannerTimeoutException: 157036ms passed since the last invocation,
timeout is currently set to 60000
Hello everyone,
During a very large crawl, the parse step (run before indexing to Solr)
fails with the following exception:
**********************************************************START
Parsing :
/home/c1/apache-nutch-2.3.1/runtime/deploy/bin/nutch parse -D
mapred.reduce.tasks=2 -D mapred.child.java.opts=-Xmx1000m -D
mapred.reduce.tasks.speculative.execution=false -D
mapred.map.tasks.speculative.execution=false -D
mapred.compress.map.output=true -D
mapred.skip.attempts.to.start.skipping=2 -D
mapred.skip.map.max.skip.records=1 1455285944-9889 -crawlId 4
16/02/12 18:11:41 INFO parse.ParserJob: ParserJob: starting at
2016-02-12 18:11:41
16/02/12 18:11:41 INFO parse.ParserJob: ParserJob: resuming: false
16/02/12 18:11:41 INFO parse.ParserJob: ParserJob: forced reparse: false
16/02/12 18:11:41 INFO parse.ParserJob: ParserJob: batchId: 1455285944-9889
16/02/12 18:11:41 WARN util.NativeCodeLoader: Unable to load
native-hadoop library for your platform... using builtin-java classes
where applicable
16/02/12 18:11:41 INFO plugin.PluginRepository: Plugins: looking in:
/tmp/hadoop-root/hadoop-unjar7610794572323535076/classes/plugins
16/02/12 18:11:41 INFO plugin.PluginRepository: Plugin Auto-activation
mode: [true]
16/02/12 18:11:41 INFO plugin.PluginRepository: Registered Plugins:
16/02/12 18:11:41 INFO plugin.PluginRepository: Html Parse Plug-in
(parse-html)
16/02/12 18:11:41 INFO plugin.PluginRepository: MetaTags
(parse-metatags)
16/02/12 18:11:41 INFO plugin.PluginRepository: HTTP Framework
(lib-http)
16/02/12 18:11:41 INFO plugin.PluginRepository: Html Indexing Filter
(index-html)
16/02/12 18:11:41 INFO plugin.PluginRepository: the nutch core
extension points (nutch-extensionpoints)
16/02/12 18:11:41 INFO plugin.PluginRepository: Basic Indexing
Filter (index-basic)
16/02/12 18:11:41 INFO plugin.PluginRepository: XML Libraries (lib-xml)
16/02/12 18:11:41 INFO plugin.PluginRepository: Anchor Indexing
Filter (index-anchor)
16/02/12 18:11:41 INFO plugin.PluginRepository: Basic URL Normalizer
(urlnormalizer-basic)
16/02/12 18:11:41 INFO plugin.PluginRepository: Language
Identification Parser/Filter (language-identifier)
16/02/12 18:11:41 INFO plugin.PluginRepository: Metadata Indexing
Filter (index-metadata)
16/02/12 18:11:41 INFO plugin.PluginRepository: CyberNeko HTML
Parser (lib-nekohtml)
16/02/12 18:11:41 INFO plugin.PluginRepository: Subcollection
indexing and query filter (subcollection)
16/02/12 18:11:41 INFO plugin.PluginRepository: SOLRIndexWriter
(indexer-solr)
16/02/12 18:11:41 INFO plugin.PluginRepository: Rel-Tag microformat
Parser/Indexer/Querier (microformats-reltag)
16/02/12 18:11:41 INFO plugin.PluginRepository: Regex URL Filter
(urlfilter-regex)
16/02/12 18:11:41 INFO plugin.PluginRepository: Http / Https
Protocol Plug-in (protocol-httpclient)
16/02/12 18:11:41 INFO plugin.PluginRepository: JavaScript Parser
(parse-js)
16/02/12 18:11:41 INFO plugin.PluginRepository: Tika Parser Plug-in
(parse-tika)
16/02/12 18:11:41 INFO plugin.PluginRepository: Top Level Domain
Plugin (tld)
16/02/12 18:11:41 INFO plugin.PluginRepository: Regex URL Filter
Framework (lib-regex-filter)
16/02/12 18:11:41 INFO plugin.PluginRepository: Regex URL Normalizer
(urlnormalizer-regex)
16/02/12 18:11:41 INFO plugin.PluginRepository: Link Analysis
Scoring Plug-in (scoring-link)
16/02/12 18:11:41 INFO plugin.PluginRepository: OPIC Scoring Plug-in
(scoring-opic)
16/02/12 18:11:41 INFO plugin.PluginRepository: Http Protocol
Plug-in (protocol-http)
16/02/12 18:11:41 INFO plugin.PluginRepository: More Indexing Filter
(index-more)
16/02/12 18:11:41 INFO plugin.PluginRepository: Creative Commons
Plugins (creativecommons)
16/02/12 18:11:41 INFO plugin.PluginRepository: Registered Extension-Points:
16/02/12 18:11:41 INFO plugin.PluginRepository: Parse Filter
(org.apache.nutch.parse.ParseFilter)
16/02/12 18:11:41 INFO plugin.PluginRepository: Nutch Index Cleaning
Filter (org.apache.nutch.indexer.IndexCleaningFilter)
16/02/12 18:11:41 INFO plugin.PluginRepository: Nutch Content Parser
(org.apache.nutch.parse.Parser)
16/02/12 18:11:41 INFO plugin.PluginRepository: Nutch URL Filter
(org.apache.nutch.net.URLFilter)
16/02/12 18:11:41 INFO plugin.PluginRepository: Nutch Scoring
(org.apache.nutch.scoring.ScoringFilter)
16/02/12 18:11:41 INFO plugin.PluginRepository: Nutch URL Normalizer
(org.apache.nutch.net.URLNormalizer)
16/02/12 18:11:41 INFO plugin.PluginRepository: Nutch Protocol
(org.apache.nutch.protocol.Protocol)
16/02/12 18:11:41 INFO plugin.PluginRepository: Nutch Index Writer
(org.apache.nutch.indexer.IndexWriter)
16/02/12 18:11:41 INFO plugin.PluginRepository: Nutch Indexing
Filter (org.apache.nutch.indexer.IndexingFilter)
16/02/12 18:11:41 INFO conf.Configuration: found resource
parse-plugins.xml at
file:/tmp/hadoop-root/hadoop-unjar7610794572323535076/parse-plugins.xml
16/02/12 18:11:42 INFO crawl.SignatureFactory: Using Signature impl:
org.apache.nutch.crawl.MD5Signature
16/02/12 18:11:42 INFO Configuration.deprecation:
mapred.skip.map.max.skip.records is deprecated. Instead, use
mapreduce.map.skip.maxrecords
16/02/12 18:11:42 INFO Configuration.deprecation:
mapred.skip.attempts.to.start.skipping is deprecated. Instead, use
mapreduce.task.skip.start.attempts
16/02/12 18:11:42 INFO Configuration.deprecation:
mapred.map.tasks.speculative.execution is deprecated. Instead, use
mapreduce.map.speculative
16/02/12 18:11:42 INFO Configuration.deprecation:
mapred.reduce.tasks.speculative.execution is deprecated. Instead, use
mapreduce.reduce.speculative
16/02/12 18:11:42 INFO Configuration.deprecation:
mapred.compress.map.output is deprecated. Instead, use
mapreduce.map.output.compress
16/02/12 18:11:42 INFO Configuration.deprecation: mapred.reduce.tasks is
deprecated. Instead, use mapreduce.job.reduces
16/02/12 18:11:42 INFO zookeeper.RecoverableZooKeeper: Process
identifier=hconnection-0x63648ee9 connecting to ZooKeeper
ensemble=ns614.mycyberhosting.com:2181,ns613.mycyberhosting.com:2181,ns615.mycyberhosting.com:2181
16/02/12 18:11:42 INFO zookeeper.ZooKeeper: Client
environment:zookeeper.version=3.4.6-1569965, built on 02/20/2014 09:09 GMT
16/02/12 18:11:42 INFO zookeeper.ZooKeeper: Client
environment:host.name=ns613.mycyberhosting.com
16/02/12 18:11:42 INFO zookeeper.ZooKeeper: Client
environment:java.version=1.8.0_25
16/02/12 18:11:42 INFO zookeeper.ZooKeeper: Client
environment:java.vendor=Oracle Corporation
16/02/12 18:11:42 INFO zookeeper.ZooKeeper: Client
environment:java.home=/usr/lib/jvm/jdk1.8.0_25/jre
16/02/12 18:11:45 INFO zookeeper.ZooKeeper: Session: 0x152cf59e03c0155
closed
16/02/12 18:11:45 INFO zookeeper.ClientCnxn: EventThread shut down
16/02/12 18:11:45 INFO mapreduce.JobSubmitter: number of splits:3
16/02/12 18:11:46 INFO mapreduce.JobSubmitter: Submitting tokens for
job: job_1455177662473_0067
16/02/12 18:11:46 INFO impl.YarnClientImpl: Submitted application
application_1455177662473_0067
16/02/12 18:11:46 INFO mapreduce.Job: The url to track the job:
http://ns613.mycyberhosting.com:8088/proxy/application_1455177662473_0067/
16/02/12 18:11:46 INFO mapreduce.Job: Running job: job_1455177662473_0067
16/02/12 18:11:55 INFO mapreduce.Job: Job job_1455177662473_0067 running
in uber mode : false
16/02/12 18:11:55 INFO mapreduce.Job: map 0% reduce 0%
16/02/12 18:15:50 INFO mapreduce.Job: Task Id :
attempt_1455177662473_0067_m_000000_0, Status : FAILED
Error: java.lang.RuntimeException:
org.apache.hadoop.hbase.client.ScannerTimeoutException: 157850ms passed
since the last invocation, timeout is currently set to 60000
at
org.apache.gora.mapreduce.GoraRecordReader.nextKeyValue(GoraRecordReader.java:122)
at
org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:533)
at
org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
at
org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: org.apache.hadoop.hbase.client.ScannerTimeoutException:
157850ms passed since the last invocation, timeout is currently set to 60000
at
org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:371)
at
org.apache.gora.hbase.query.HBaseScannerResult.nextInner(HBaseScannerResult.java:49)
at org.apache.gora.query.impl.ResultBase.next(ResultBase.java:111)
at
org.apache.gora.mapreduce.GoraRecordReader.nextKeyValue(GoraRecordReader.java:118)
... 11 more
Caused by: org.apache.hadoop.hbase.UnknownScannerException:
org.apache.hadoop.hbase.UnknownScannerException: Name: 3945, already closed?
at
org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3146)
at
org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29941)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2027)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:108)
at
org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:110)
at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:90)
at java.lang.Thread.run(Thread.java:745)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
Method)
at
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:408)
at
org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
at
org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95)
at
org.apache.hadoop.hbase.protobuf.ProtobufUtil.getRemoteException(ProtobufUtil.java:284)
at
org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:204)
at
org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:59)
at
org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:117)
at
org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:93)
at
org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:355)
... 14 more
Caused by:
org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.UnknownScannerException):
org.apache.hadoop.hbase.UnknownScannerException: Name: 3945, already closed?
at
org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3146)
at
org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29941)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2027)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:108)
at
org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:110)
at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:90)
at java.lang.Thread.run(Thread.java:745)
at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1457)
at
org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1661)
at
org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1719)
at
org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.scan(ClientProtos.java:30387)
at
org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:174)
... 18 more
16/02/12 18:17:19 INFO mapreduce.Job: map 33% reduce 0%
16/02/12 18:18:25 INFO mapreduce.Job: map 67% reduce 0%
16/02/12 18:19:53 INFO mapreduce.Job: Task Id :
attempt_1455177662473_0067_m_000000_1, Status : FAILED
Error: java.lang.RuntimeException:
org.apache.hadoop.hbase.client.ScannerTimeoutException: 157036ms passed
since the last invocation, timeout is currently set to 60000
at
org.apache.gora.mapreduce.GoraRecordReader.nextKeyValue(GoraRecordReader.java:122)
at
org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:533)
at
org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
at
org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: org.apache.hadoop.hbase.client.ScannerTimeoutException:
157036ms passed since the last invocation, timeout is currently set to 60000
at
org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:371)
at
org.apache.gora.hbase.query.HBaseScannerResult.nextInner(HBaseScannerResult.java:49)
at org.apache.gora.query.impl.ResultBase.next(ResultBase.java:111)
at
org.apache.gora.mapreduce.GoraRecordReader.nextKeyValue(GoraRecordReader.java:118)
... 11 more
Caused by: org.apache.hadoop.hbase.UnknownScannerException:
org.apache.hadoop.hbase.UnknownScannerException: Name: 3949, already closed?
at
org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3146)
at
org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29941)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2027)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:108)
at
org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:110)
at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:90)
at java.lang.Thread.run(Thread.java:745)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
Method)
at
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:408)
at
org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
at
org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95)
at
org.apache.hadoop.hbase.protobuf.ProtobufUtil.getRemoteException(ProtobufUtil.java:284)
at
org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:204)
at
org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:59)
at
org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:117)
at
org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:93)
at
org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:355)
... 14 more
Caused by:
org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.UnknownScannerException):
org.apache.hadoop.hbase.UnknownScannerException: Name: 3949, already closed?
at
org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3146)
at
org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29941)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2027)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:108)
at
org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:110)
at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:90)
at java.lang.Thread.run(Thread.java:745)
at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1457)
at
org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1661)
at
org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1719)
at
org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.scan(ClientProtos.java:30387)
at
org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:174)
... 18 more
16/02/12 18:23:44 INFO mapreduce.Job: Task Id :
attempt_1455177662473_0067_m_000000_2, Status : FAILED
Error: java.lang.RuntimeException:
org.apache.hadoop.hbase.client.ScannerTimeoutException: 157740ms passed
since the last invocation, timeout is currently set to 60000
at
org.apache.gora.mapreduce.GoraRecordReader.nextKeyValue(GoraRecordReader.java:122)
at
org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:533)
at
org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
at
org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: org.apache.hadoop.hbase.client.ScannerTimeoutException:
157740ms passed since the last invocation, timeout is currently set to 60000
at
org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:371)
at
org.apache.gora.hbase.query.HBaseScannerResult.nextInner(HBaseScannerResult.java:49)
at org.apache.gora.query.impl.ResultBase.next(ResultBase.java:111)
at
org.apache.gora.mapreduce.GoraRecordReader.nextKeyValue(GoraRecordReader.java:118)
... 11 more
Caused by: org.apache.hadoop.hbase.UnknownScannerException:
org.apache.hadoop.hbase.UnknownScannerException: Name: 3951, already closed?
at
org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3146)
at
org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29941)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2027)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:108)
at
org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:110)
at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:90)
at java.lang.Thread.run(Thread.java:745)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
Method)
at
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:408)
at
org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
at
org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95)
at
org.apache.hadoop.hbase.protobuf.ProtobufUtil.getRemoteException(ProtobufUtil.java:284)
at
org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:204)
at
org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:59)
at
org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:117)
at
org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:93)
at
org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:355)
... 14 more
Caused by:
org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.UnknownScannerException):
org.apache.hadoop.hbase.UnknownScannerException: Name: 3951, already closed?
at
org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3146)
at
org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29941)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2027)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:108)
at
org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:110)
at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:90)
at java.lang.Thread.run(Thread.java:745)
at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1457)
at
org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1661)
at
org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1719)
at
org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.scan(ClientProtos.java:30387)
at
org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:174)
... 18 more
16/02/12 18:27:45 INFO mapreduce.Job: map 100% reduce 0%
16/02/12 18:27:46 INFO mapreduce.Job: Job job_1455177662473_0067 failed
with state FAILED due to: Task failed task_1455177662473_0067_m_000000
Job failed as tasks failed. failedMaps:1 failedReduces:0
16/02/12 18:27:46 INFO mapreduce.Job: Counters: 34
File System Counters
FILE: Number of bytes read=0
FILE: Number of bytes written=232832
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=2289
HDFS: Number of bytes written=0
HDFS: Number of read operations=2
HDFS: Number of large read operations=0
HDFS: Number of write operations=0
Job Counters
Failed map tasks=4
Launched map tasks=6
Other local map tasks=3
Data-local map tasks=3
Total time spent by all maps in occupied slots (ms)=3306258
Total time spent by all reduces in occupied slots (ms)=0
Total time spent by all map tasks (ms)=1653129
Total vcore-seconds taken by all map tasks=1653129
Total megabyte-seconds taken by all map tasks=6771216384
Map-Reduce Framework
Map input records=31186
Map output records=30298
Input split bytes=2289
Spilled Records=0
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=4836
CPU time spent (ms)=577240
Physical memory (bytes) snapshot=2860380160
Virtual memory (bytes) snapshot=10654486528
Total committed heap usage (bytes)=3088056320
ParserStatus
failed=27
success=22978
File Input Format Counters
Bytes Read=0
File Output Format Counters
Bytes Written=0
Exception in thread "main" java.lang.RuntimeException: job failed:
name=[4]parse, jobid=job_1455177662473_0067
at org.apache.nutch.util.NutchJob.waitForCompletion(NutchJob.java:120)
at org.apache.nutch.parse.ParserJob.run(ParserJob.java:260)
at org.apache.nutch.parse.ParserJob.parse(ParserJob.java:286)
at org.apache.nutch.parse.ParserJob.run(ParserJob.java:337)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.nutch.parse.ParserJob.main(ParserJob.java:341)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Error running:
/home/c1/apache-nutch-2.3.1/runtime/deploy/bin/nutch parse -D
mapred.reduce.tasks=2 -D mapred.child.java.opts=-Xmx1000m -D
mapred.reduce.tasks.speculative.execution=false -D
mapred.map.tasks.speculative.execution=false -D
mapred.compress.map.output=true -D
mapred.skip.attempts.to.start.skipping=2 -D
mapred.skip.map.max.skip.records=1 1455285944-9889 -crawlId 4
Failed with exit value 1.
**********************************************************END
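For reference, the "timeout is currently set to 60000" in the trace is the HBase client scanner lease. A minimal hbase-site.xml sketch of the settings involved (property names assume HBase 0.98.x, which matches the stack trace; the 180000/100 values are illustrative, not recommendations):

```xml
<!-- Sketch only: raise the scanner lease window and fetch fewer rows
     per next() call, so a slow parse batch does not let the lease
     expire between invocations. Verify property names against the
     HBase version actually deployed. -->
<property>
  <name>hbase.client.scanner.timeout.period</name>
  <value>180000</value> <!-- default is 60000 ms, as seen in the trace -->
</property>
<property>
  <name>hbase.rpc.timeout</name>
  <value>180000</value> <!-- should be at least the scanner timeout -->
</property>
<property>
  <name>hbase.client.scanner.caching</name>
  <value>100</value> <!-- fewer rows buffered per RPC keeps the lease fresher -->
</property>
```

Note that the region servers enforce the lease with the same property, so a change would have to be applied on the servers and they would need a restart, not just the Nutch client side.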
Please advise.
--
Please let me know if you have any questions, concerns or updates.
Have a great day ahead :)
Thanks and Regards,
Kshitij Shukla
Software developer
Cyber Infrastructure (CIS) - The RightSourcing Specialists with 1250 man years of experience!
DISCLAIMER: INFORMATION PRIVACY is important for us, If you are not the
intended recipient, you should delete this message and are notified that
any disclosure, copying or distribution of this message, or taking any
action based on it, is strictly prohibited by Law.
Please don't print this e-mail unless you really need to.