Posted to user@nutch.apache.org by Alexander Baranov <Al...@epam.com> on 2015/04/29 16:17:15 UTC
TableNotFoundException during inject job
Hello, everybody.
I am seeing rather strange behavior with Nutch 2.3: even the initial Inject job fails with the exception shown below.
The whole Hadoop infrastructure is up and running:
root@5e7ca0b0c19d:~# jps
2810 NutchServer
1071 SecondaryNameNode
99 QuorumPeerMain
1694 ResourceManager
4598 Jps
795 NameNode
2243 HMaster
2376 HRegionServer
2669 ThriftServer
1789 NodeManager
913 DataNode
Nutch itself also appears to be configured correctly, because with the same configuration I was able to crawl some pages and see the data in Solr.
If I understand correctly, one of the tasks of InjectorJob is to create the 'webpage' table in HBase. The HBase shell, however, shows 0 tables.
Do you have any idea what is wrong here and what should be done to fix it? The log follows:
2015-04-29 13:23:58,978 INFO crawl.InjectorJob - InjectorJob: starting at 2015-04-29 13:23:58
2015-04-29 13:23:58,979 INFO crawl.InjectorJob - InjectorJob: Injecting urlDir: ram.txt
2015-04-29 13:24:01,434 ERROR store.HBaseStore - org.apache.hadoop.hbase.TableExistsException: webpage
2015-04-29 13:24:01,434 ERROR store.HBaseStore - [Ljava.lang.StackTraceElement;@6a19905e
2015-04-29 13:24:01,454 INFO crawl.InjectorJob - InjectorJob: Using class org.apache.gora.hbase.store.HBaseStore as the Gora storage class.
2015-04-29 13:24:01,520 WARN util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2015-04-29 13:24:01,607 WARN snappy.LoadSnappy - Snappy native library not loaded
2015-04-29 13:24:02,501 ERROR store.HBaseStore - org.apache.hadoop.hbase.TableExistsException: webpage
2015-04-29 13:24:02,501 ERROR store.HBaseStore - [Ljava.lang.StackTraceElement;@523b3317
2015-04-29 13:24:02,813 INFO regex.RegexURLNormalizer - can't find rules for scope 'inject', using default
2015-04-29 13:24:02,986 WARN client.HConnectionManager$HConnectionImplementation - Encountered problems when prefetch META table:
org.apache.hadoop.hbase.TableNotFoundException: Cannot find row in .META. for table: webpage, row=webpage,,99999999999999
at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:151)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.prefetchRegionCache(HConnectionManager.java:1059)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:1121)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:1001)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:958)
at org.apache.hadoop.hbase.client.HTable.finishSetup(HTable.java:251)
at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:155)
at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:129)
at org.apache.gora.hbase.store.HBaseTableConnection$1.<init>(HBaseTableConnection.java:87)
at org.apache.gora.hbase.store.HBaseTableConnection.getTable(HBaseTableConnection.java:87)
at org.apache.gora.hbase.store.HBaseTableConnection.put(HBaseTableConnection.java:186)
at org.apache.gora.hbase.store.HBaseStore.put(HBaseStore.java:260)
at org.apache.gora.hbase.store.HBaseStore.put(HBaseStore.java:79)
at org.apache.gora.mapreduce.GoraRecordWriter.write(GoraRecordWriter.java:65)
at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:638)
at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
at org.apache.nutch.crawl.InjectorJob$UrlMapper.map(InjectorJob.java:188)
at org.apache.nutch.crawl.InjectorJob$UrlMapper.map(InjectorJob.java:82)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:223)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
2015-04-29 13:24:02,996 ERROR store.HBaseStore - webpage
2015-04-29 13:24:02,996 ERROR store.HBaseStore - [Ljava.lang.StackTraceElement;@f757c05
2015-04-29 13:24:03,009 WARN mapred.FileOutputCommitter - Output path is null in cleanup
2015-04-29 13:24:03,073 INFO crawl.InjectorJob - InjectorJob: total number of urls rejected by filters: 0
2015-04-29 13:24:03,073 INFO crawl.InjectorJob - InjectorJob: total number of urls injected after normalization and filtering: 1
2015-04-29 13:24:03,075 INFO crawl.InjectorJob - Injector: finished at 2015-04-29 13:24:03, elapsed: 00:00:04
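What confuses me most is that the log above reports both TableExistsException and TableNotFoundException for the same table. To make such contradictions easier to spot in a longer log, here is a minimal sketch that pulls out every HBase table exception together with the table it names. The class name TableExceptionScan, the method scan, and the output format are made up for illustration and are not part of Nutch, Gora, or HBase; the log file path is whatever you pass as the first argument.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of Nutch/Gora/HBase): lists HBase table
// exceptions found in a log, with the table each one refers to.
public class TableExceptionScan {

    // Matches e.g. "org.apache.hadoop.hbase.TableExistsException: webpage" and
    // "org.apache.hadoop.hbase.TableNotFoundException: ... for table: webpage, ...".
    private static final Pattern TABLE_EXCEPTION = Pattern.compile(
        "org\\.apache\\.hadoop\\.hbase\\.(Table\\w+Exception): (?:.*for table: )?(\\w+)");

    // Returns one "ExceptionName -> tableName" entry per matching line, in log order.
    static List<String> scan(List<String> lines) {
        List<String> found = new ArrayList<>();
        for (String line : lines) {
            Matcher m = TABLE_EXCEPTION.matcher(line);
            if (m.find()) {
                found.add(m.group(1) + " -> " + m.group(2));
            }
        }
        return found;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java TableExceptionScan <log-file>");
            return;
        }
        for (String hit : scan(Files.readAllLines(Paths.get(args[0])))) {
            System.out.println(hit);
        }
    }
}
```

If the same table shows up under both exception names, the client and the HBase shell are seeing inconsistent views of the table, which is exactly what I cannot explain.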
Alexander Baranov