Posted to user@nutch.apache.org by Nguyen Thi Ngoc Huong <hu...@gmail.com> on 2009/08/31 12:03:15 UTC
How to Inject urls to Hbase
This is the error I get when I try to inject urls into an HBase table (the
way urls are injected into the crawlDb in Nutch).
I use Doğacan Güney<https://issues.apache.org/jira/secure/ViewProfile.jspa?name=dogacan>'s
code to implement the class "InjectorHbase"; in the class
InjectorHbaseMapper this is the map function:
public void map(LongWritable key, Text value,
    OutputCollector<Text, Text> output, Reporter reporter)
    throws IOException {
  System.out.println("Entering map");
  if (table == null) {
    System.out.println("Table == null");
    throw new IOException("Cannot connect to HBase table");
  }
  String url = value.toString();
  String reversedUrl;
  try {
    url = urlNormalizers.normalize(url, URLNormalizers.SCOPE_INJECT);
    url = filters.filter(url);
    if (url == null) {
      return;
    }
    reversedUrl = TableUtil.reverseUrl(url);
  } catch (Exception e) {
    LOG.warn("Skipping " + url + ":" + e);
    return;
  }
  BatchUpdate bu = new BatchUpdate(reversedUrl);
  bu.put(META_INJECT_KEY, TableUtil.YES_VAL);
  table.commit(bu);
}
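(For illustration, since only this fragment of InjectorHbase is shown: Hadoop
instantiates the mapper through its no-argument constructor, so a mapper that
is nested inside the job class must be declared static. A non-static inner
class requires an enclosing instance and therefore has no no-arg constructor.
The class names below are hypothetical, just to demonstrate the difference:)

```java
import java.lang.reflect.Constructor;

public class NestedMapperDemo {
    // The shape Hadoop expects: a static nested class, which gets an
    // implicit no-argument constructor.
    static class StaticMapper { }

    // A non-static inner class needs an enclosing NestedMapperDemo
    // instance, so its only constructor takes that instance as a
    // parameter and reflective no-arg instantiation fails.
    class InnerMapper { }

    // Returns true when the class declares a no-argument constructor.
    static boolean hasNoArgCtor(Class<?> c) {
        try {
            Constructor<?> ctor = c.getDeclaredConstructor();
            return ctor != null;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("static nested: " + hasNoArgCtor(StaticMapper.class));
        System.out.println("inner (non-static): " + hasNoArgCtor(InnerMapper.class));
    }
}
```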
When I run the program, I get this error:
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException:
org.apache.nutch.crawl.InjectorHbase$InjectorHbaseMapper
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:720)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:744)
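(For illustration: the `$` in InjectorHbase$InjectorHbaseMapper is how the JVM
writes the binary name of a nested class. Hadoop loads the mapper reflectively
by that name, so a ClassNotFoundException usually means the jar containing the
class is not on the task's classpath, e.g. the job jar was never registered
with the JobConf. The sketch below reproduces only the lookup mechanism; the
Nutch class is of course absent from a plain JVM:)

```java
public class ClassLookupDemo {
    // Returns true when the given binary class name can be loaded,
    // mimicking how Configuration.getClass resolves a configured mapper.
    static boolean canLoad(String binaryName) {
        try {
            Class.forName(binaryName);
            return true;
        } catch (ClassNotFoundException e) {
            // This is the failure Hadoop wraps in a RuntimeException.
            return false;
        }
    }

    public static void main(String[] args) {
        // Nested classes use Outer$Inner as their binary name.
        String mapper = "org.apache.nutch.crawl.InjectorHbase$InjectorHbaseMapper";
        System.out.println(mapper + " loadable: " + canLoad(mapper));
        System.out.println("java.lang.String loadable: " + canLoad("java.lang.String"));
    }
}
```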
I don't understand the reason, because the class InjectorHbaseMapper is
there. A second problem: I can't see the println output from my map/reduce
functions, even though I configured
<property>
<name>mapred.job.tracker</name>
<value>local</value>
</property>
in hadoop-site.xml.
--
Nguyễn Thị Ngọc Hương