Posted to user@nutch.apache.org by Nguyen Thi Ngoc Huong <hu...@gmail.com> on 2009/08/31 12:03:15 UTC

How to inject URLs into HBase

I get the error below when I try to inject URLs into an HBase table (the same
way URLs are injected into the crawlDb in Nutch).
I used Doğacan Güney<https://issues.apache.org/jira/secure/ViewProfile.jspa?name=dogacan>'s
code to implement the class "InjectorHbase". This is the map function in the
class InjectorHbaseMapper:

public void map(LongWritable key, Text value,
        OutputCollector<Text, Text> output, Reporter reporter)
        throws IOException {
    // Debug trace ("Vao map" means "entered map").
    System.out.println("Vao map");
    if (table == null) {
        System.out.println("Table == null");
        throw new IOException("Can not connect to hbase table");
    }

    String url = value.toString();
    String reversedUrl;
    try {
        // Normalize and filter the URL; a null result means it was rejected.
        url = urlNormalizers.normalize(url, URLNormalizers.SCOPE_INJECT);
        url = filters.filter(url);
        if (url == null) {
            return;
        }
        // The reversed URL is used as the row key.
        reversedUrl = TableUtil.reverseUrl(url);
    } catch (Exception e) {
        LOG.warn("Skipping " + url + ":" + e);
        return;
    }

    // Mark the row as injected and write it to the table.
    BatchUpdate bu = new BatchUpdate(reversedUrl);
    bu.put(META_INJECT_KEY, TableUtil.YES_VAL);

    table.commit(bu);
}
When I run the program, it fails with this error:
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException:
org.apache.nutch.crawl.InjectorHbase$InjectorHbaseMapper
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:720)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:744)
I don't understand why, because the class InjectorHbaseMapper does exist.
Also, I can't debug the map/reduce functions using println calls inside them,
even though I configured
<property>
  <name>mapred.job.tracker</name>
  <value>local</value>
</property>
in hadoop-site.xml.
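
A note on the class name in the trace: the "$" in
org.apache.nutch.crawl.InjectorHbase$InjectorHbaseMapper is how Java writes the
binary name of a nested class, so the name itself looks correct for a mapper
declared inside InjectorHbase. The sketch below is only an illustration with
hypothetical names (NestedClassNameDemo, DemoMapper, and the two helper
methods are not part of Nutch or Hadoop). It shows that a lookup by name
succeeds when the class is visible to the class loader doing the lookup and
fails with ClassNotFoundException otherwise. It does not show why the class is
missing in this particular job.

```java
public class NestedClassNameDemo {

    // Hypothetical stand-in for a mapper declared inside a job class.
    public static class DemoMapper {
    }

    // Returns the binary name the JVM uses for the nested class.
    public static String binaryName() {
        return DemoMapper.class.getName();
    }

    // Tries to load a class from its name, as a framework would from a
    // class name stored in a configuration string.
    public static boolean canLoad(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String name = binaryName();
        // The nested class name is joined to the outer class name with '$'.
        System.out.println(name);
        // Visible to this class loader, so the lookup succeeds.
        System.out.println(canLoad(name));
        // Not on the classpath, so the lookup fails.
        System.out.println(canLoad("no.such.pkg.Missing$Mapper"));
    }
}
```

If the same check fails inside the job, the class file is probably not on the
classpath that the job actually uses, even though it exists in the source tree.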

-- 
Nguyễn Thị Ngọc Hương